
Real Time Subtitles to Pasting Program

ahuckphin
As part of my ongoing attempts to make the most of online classes, and my discovery/re-discovery of Microsoft Teams' built-in live captions, I have been thinking.

 

Does a program currently exist / can I code a program that does something along the lines of this: 

 

  1. Launch the program. Runs in the background
  2. Capture an image of the live captions
  3. Run OCR on that image.
  4. Save characters obtained from OCR as NewCharacterSet
  5. Then loop: 
    1. Save NewCharacterSet as OldCharacterSet
    2. Capture an image of the live captions
    3. Run OCR on that image
    4. Save characters obtained from OCR as NewCharacterSet
    5. Compare NewCharacterSet to OldCharacterSet. 
    6. If not equal 
      1. Take new character set, insert into Windows clipboard 
      2. At cursor pointer, paste text
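
The numbered steps above can be sketched as a small polling loop. This is only a minimal sketch in Python, not a finished tool: `read_caption` and `on_change` are hypothetical callables standing in for "capture + OCR" (or "select + copy") and "put on clipboard + paste", and the 0.5 second polling interval is an assumption.

```python
import time
from typing import Callable, Optional


def watch_captions(
    read_caption: Callable[[], str],
    on_change: Callable[[str], None],
    poll_seconds: float = 0.5,
    max_polls: Optional[int] = None,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Poll read_caption() and call on_change(text) whenever the text differs.

    read_caption and on_change are placeholders (assumptions), e.g. a
    screenshot passed through OCR and a clipboard paste.
    Returns how many times on_change was called.
    """
    new_text = read_caption()  # steps 2-4: first capture, only stored
    changes = 0
    polls = 0
    while max_polls is None or polls < max_polls:
        old_text = new_text          # step 5.1
        sleep(poll_seconds)
        new_text = read_caption()    # steps 5.2-5.4
        polls += 1
        if new_text != old_text:     # steps 5.5-5.6
            on_change(new_text)
            changes += 1
    return changes
```

As in the list, the very first caption is only stored and not pasted; calling `on_change` once before the loop would change that. `max_polls=None` keeps the loop running until the program is stopped.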

 

Alternatively, the live captions can be selected. Is it possible for the mouse cursor to move to a particular x,y coordinate and select the text, as opposed to capture + OCR?

 

Perhaps a flow chart would have been better to demonstrate the potential logic. 

 

I am most comfortable programming in Visual Basic in Visual Studio 2019 and am aware of Tesseract OCR. However, I am completely clueless about how to have e.g. a WinForms app constantly "print screen", and how to paste at the cursor position. 

 

As I write this, I feel like AutoHotkey has potential, although I have never tried implementing variables and conditional statements in AutoHotkey scripts before. 


 


I would basically ...

 

1. Use OBS to record the screen.

2. Use VirtualDub or something similar to crop the recorded video so that only the region with subtitles is visible.

3. Use VirtualDub to apply filters (convert to grayscale / 256 colors, for example) and export each frame of the video as a BMP / PNG sequence.

3b. Use VirtualDub to convert the video to 1-5 fps (a subtitle line will be on the screen for at least 1 second, so you'll see it in one of those frames).

- all three above can be done in one shot in VirtualDub -

4. Write a script/program that loads one image at a time and filters the area where the subtitle is so that the background becomes black or some neutral color, leaving you with just the actual text (white, with maybe some black / gray antialiasing around the letters).

4b. Optionally throw out any picture that's less than 5-10% different from the previous frame (this removes some duplicate frames). A change in subtitle will cause enough of a difference; otherwise you'll only have a few pixels different around the edges of the text due to antialiasing, if any is used.

5. Feed the pictures into VobSub / Aegisub or some subtitle editor (they have OCR engines optimized for subtitle fonts, and you can teach the program the character set for higher accuracy / recognition). I guess you could also use Tesseract OCR.

6. Remove duplicate text in the subtitle / clean it up.
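
Step 4b could look roughly like the following. This is a minimal sketch in Python under stated assumptions: each frame is already a flat list of grayscale pixel values (0-255), the 5% threshold is the lower end of the range mentioned above, and the per-pixel `tolerance` of 16 is an arbitrary value chosen to ignore antialiasing noise. The function names are made up for illustration.

```python
from typing import List, Sequence

Frame = Sequence[int]  # flattened grayscale pixels, 0..255 (assumption)


def changed_fraction(a: Frame, b: Frame, tolerance: int = 16) -> float:
    """Fraction of pixels whose value differs by more than `tolerance`."""
    if len(a) != len(b):
        raise ValueError("frames must have the same number of pixels")
    if not a:
        return 0.0
    changed = sum(1 for pa, pb in zip(a, b) if abs(pa - pb) > tolerance)
    return changed / len(a)


def drop_near_duplicates(
    frames: Sequence[Frame], threshold: float = 0.05, tolerance: int = 16
) -> List[int]:
    """Return the indices of the frames worth sending to OCR.

    Each frame is compared with the last frame that was kept, so a slow
    drift of small changes cannot slip through unnoticed.
    """
    kept: List[int] = []
    for i, frame in enumerate(frames):
        if not kept or changed_fraction(frames[kept[-1]], frame, tolerance) >= threshold:
            kept.append(i)
    return kept
```

Comparing against the last kept frame rather than the immediately preceding one is a design choice, not something from the steps above; either reading of "different between frames" would work for cleanly changing subtitles.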

 

Something else you may want to consider... the subtitles are probably not "burnt" into the video, so maybe you can use the developer tools / network tab and see where those subtitles are loaded from.

If the video is on YouTube, tools like JDownloader can download the video and the subtitles for you.


^F1:: ; Ctrl + F1

MouseClickDrag, Left, 320, 980, 785, 980 ; move cursor and select the line of text from left to right

sleep 25

Send ^c ; copy selected text 

sleep 25

Send {ALT DOWN}{TAB}{ALT UP} ; switch program

sleep 25

Send {Space}

sleep 25

Send ^+v ; paste clipboard (Ctrl+Shift+V)

sleep 25

Send {ALT DOWN}{TAB}{ALT UP} ; switch program

NewText := Clipboard ; assigns value of clipboard to NewText variable

sleep 25

loop
{
	OldText := NewText ; assigns value of NewText variable to OldText variable
	
	sleep 25
	
	MouseClickDrag, Left, 320, 980, 785, 980 ; move cursor and select the line of text from left to right

	sleep 25

	Send ^c ; copy selected text 
	
	sleep 25
	
	NewText := Clipboard ; override the value of NewText variable with clipboard 
	
	sleep 25 
	
	if (OldText != NewText) ; only switch and paste when the copied text has changed
	{
		Send {ALT DOWN}{TAB}{ALT UP} ; switch program

		sleep 25

		Send {Space}

		sleep 25

		Send ^+v ; paste clipboard (Ctrl+Shift+V)
	
		sleep 25

		Send {ALT DOWN}{TAB}{ALT UP} ; switch program
	}
	
}

return

^F2::Reload ; Ctrl + F2 reloads the script.

return

I have done it with AutoHotkey. Truly impressed at the capabilities of this program. 

 

Is it perfect? No. I am having a problem: it sometimes fails to select the whole line of text, and sometimes it selects my profile pic + user name too. 

 

Works better when it is less stressed, a.k.a. when the speaker talks slowly.

 

Video demonstration: 

1775345267_LTTDemo.mov
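
One possible refinement of the comparison step, shown in Python only to illustrate the logic (it would need porting to AutoHotkey): the loop pastes the whole copied line every time it changes, so if the captions grow word by word (an assumption about how they update), the same words get pasted repeatedly. A minimal sketch that keeps only the appended part, with a made-up function name:

```python
def appended_text(old: str, new: str) -> str:
    """Return the part of `new` that still needs pasting after `old`.

    If `new` merely extends `old`, only the added tail is returned.
    Otherwise the caption is assumed to have moved on to a new line,
    and the whole of `new` is returned.
    """
    if new.startswith(old):
        return new[len(old):]
    return new


# Example: the caption grew, so only the new words are returned.
tail = appended_text("hello", "hello world")  # " world"
```

When nothing has changed the function returns an empty string, which can double as the "do nothing" case of the comparison.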


2 hours ago, mariushm said:

I would basically ...

 

1. Use OBS to record the screen.

2. Use VirtualDub or something similar to crop the recorded video so that only the region with subtitles is visible.

3. Use VirtualDub to apply filters (convert to grayscale / 256 colors, for example) and export each frame of the video as a BMP / PNG sequence.

3b. Use VirtualDub to convert the video to 1-5 fps (a subtitle line will be on the screen for at least 1 second, so you'll see it in one of those frames).

- all three above can be done in one shot in VirtualDub -

4. Write a script/program that loads one image at a time and filters the area where the subtitle is so that the background becomes black or some neutral color, leaving you with just the actual text (white, with maybe some black / gray antialiasing around the letters).

4b. Optionally throw out any picture that's less than 5-10% different from the previous frame (this removes some duplicate frames). A change in subtitle will cause enough of a difference; otherwise you'll only have a few pixels different around the edges of the text due to antialiasing, if any is used.

5. Feed the pictures into VobSub / Aegisub or some subtitle editor (they have OCR engines optimized for subtitle fonts, and you can teach the program the character set for higher accuracy / recognition). I guess you could also use Tesseract OCR.

6. Remove duplicate text in the subtitle / clean it up.

 

Something else you may want to consider... the subtitles are probably not "burnt" into the video, so maybe you can use the developer tools / network tab and see where those subtitles are loaded from.

If the video is on YouTube, tools like JDownloader can download the video and the subtitles for you.

I guess this can work. 

If I understand your suggestion correctly, this can not be done in real time and if that is correct, it is probably faster to do this -  

 


8 minutes ago, bindydad123 said:

I guess this can work. 

If I understand your suggestion correctly, this can not be done in real time and if that is correct, it is probably faster to do this -  

 

No, it's not real time. The assumption was that the subtitles were hardcoded in the video and needed to be OCR-ed.

 

Could be close to real time, I guess: capture just the region with the subtitle, and configure OBS to record to a folder and switch files every 1-5 minutes, incrementing a number in the file name (ending up with one mp4/mkv file for each minute or each [custom duration] period).

Have a script monitor the folder, and as soon as a file is closed and a new one is created by OBS, have the script push the video through ffmpeg (or something else that generates the images) to a RAM drive or some temporary folder, then batch process the images, OCR them and filter the text.

You could do it easily in Python... I'm not a fan of Python, but I could easily do something with PHP, for example: run one or multiple scripts in parallel that pick up completed segments as soon as they're done (as PHP is single-threaded, I would run multiple scripts in parallel if the speed is not real time).
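
The "pick up completed segments" part might be sketched like this in Python. It is a rough sketch under assumptions, not a tested pipeline: segments are assumed to be `.mp4` files, the most recently modified file is assumed to be the one the recorder is still writing, and both function names are made up. The ffmpeg command uses the `fps` video filter to write one PNG per second of video.

```python
from pathlib import Path
from typing import List, Set


def finished_segments(folder: Path, done: Set[str], pattern: str = "*.mp4") -> List[Path]:
    """Return segment files that look complete and have not been processed.

    Assumption: the file with the latest modification time is still being
    written by the recorder, so it is always skipped.
    """
    files = sorted(folder.glob(pattern), key=lambda p: (p.stat().st_mtime, p.name))
    return [p for p in files[:-1] if p.name not in done]


def ffmpeg_frames_command(segment: Path, out_dir: Path, fps: int = 1) -> List[str]:
    """Build an ffmpeg command that writes `fps` PNG frames per second of video."""
    out_pattern = out_dir / f"{segment.stem}_%05d.png"
    return ["ffmpeg", "-i", str(segment), "-vf", f"fps={fps}", str(out_pattern)]
```

A monitoring script would call `finished_segments` in a loop, run each command with something like `subprocess.run(cmd, check=True)`, add the file name to `done`, and then hand the PNGs to the OCR step.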
