How to Install and Use Tesseract OCR on Windows - Optical Character Recognition

  128,520 views

JayMartMedia

Comments: 98
@augusthyden1077 10 months ago
Holy, you should be the standard for YouTube tutorials. Never experienced such quick and concise tutoring!
@JayMartMedia 10 months ago
Glad you found it helpful!
@RoyelPayne 6 months ago
No kidding, this guy has just set the gold standard. Very good, thanks for sharing!
@MrBendybruce 2 years ago
Thank you for making this video. I am visually impaired and am currently in the biggest battle of my life to try and save the vision I still have left. This OCR software is a valuable tool in allowing me to be able to read my own physical mail by scanning it into my computer as a JPEG and then converting it into text which can then be read aloud. Thanks again.
@TVDaJa 2 years ago
Keep up keeping up, Bendy
@maxsilon 1 year ago
Good job! Keep you man!!
@clehil 1 year ago
This tutorial should be the canonical online tutorial for Windows users of Tesseract. The work you did to eliminate distractions and make the instructions work successfully at a fast and precise rate is very apparent.
@JayMartMedia 1 year ago
Thanks! Glad to hear that you found the video helpful!
@yekna459 4 years ago
By far the best tutorial on Tesseract on YouTube. Thanks for uploading.
@matthewjohnson3610 5 years ago
Thanks so much. I've been looking for a solution for this for years. Every few months I go looking and have never been able to find anything. I got a bit lucky this time that I stumbled upon Tesseract but couldn't figure out how to get started. This was perfect.
@JayMartMedia 5 years ago
Great, I'm glad you found this video helpful. Thanks for the encouraging feedback!
@Sodomantis 1 year ago
02:03 you can just press the crop button and the image size will conform to the image you pasted in. Thanks for the video.
@RogerCooley 10 months ago
Thank you. A complete presentation. Following your video I was able to extract a few PNG files. I wish you spoke a little slower. I had to stop and rewind segments a few times to understand what you were saying. Please enunciate. Thanks again.
@vadimcastro332 4 years ago
Thank you sir!! This was very helpful and respectful of the viewer's time, really appreciate it!
@JayMartMedia 4 years ago
Glad it was helpful!
@falsigo 3 months ago
I could understand without any volume, kudos! Thanks
@aligeovany4645 3 months ago
Jay Mart, with this video you have made me a fan. Thanks for sharing this quick, perfect and useful video.
@shreesingh3137 3 years ago
Mind Blowing Video on Tesseract-OCR 🔥🔥
@JayMartMedia 3 years ago
Glad you found it helpful!
@jacobhadden5407 1 year ago
That worked great. I am not particularly sophisticated with cmd prompts but was able to sort it out. The Tesseract OCR is very accurate.
@JayMartMedia 1 year ago
Glad you found it helpful!
@CharlieKelloggPilot 2 years ago
Well done. Thank you for being quick and to the point.
@matthewjohnson3610 5 years ago
Also, in case anybody is wondering, put something like this in a batch file if you want to process a folder of files: for %%X in (*.png) do "tesseract.exe" "%%X" "%%X-ocr"
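For anyone who prefers Python over a batch file, the same folder loop can be sketched roughly as below. This is only a sketch under assumptions: it assumes `tesseract` is reachable on PATH, and `ocr_folder` with its `run` parameter are hypothetical names introduced here for illustration, not part of Tesseract or the video.

```python
import subprocess
from pathlib import Path


def ocr_folder(folder, pattern="*.png", run=subprocess.run):
    """Run tesseract once for every matching image in `folder`.

    `run` defaults to subprocess.run; it is a parameter only so the
    loop can be exercised without Tesseract installed (hypothetical
    design choice for this sketch).
    """
    commands = []
    for image in sorted(Path(folder).glob(pattern)):
        # Mirrors "%%X-ocr" from the batch file: this is the output
        # base name, and tesseract appends ".txt" to it.
        output_base = f"{image}-ocr"
        command = ["tesseract", str(image), output_base]
        # Passing a list avoids manual quoting of paths with spaces.
        run(command, check=True)
        commands.append(command)
    return commands
```

Calling `ocr_folder("Pictures")` from a script would then process every PNG in a folder named Pictures, assuming such a folder exists next to the script.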
@betting55555 2 years ago
Awesome video, good step by step. Thank you!
@MedoHamdani 2 years ago
Thank you, straight to the point. Is the video about Python integration available yet? Is it possible to process a batch of images, and is it possible to extract text directly from a PDF of scanned images? Lastly, is it possible to put them in a GUI? Thanks mate
@LilaGovindaDas 1 year ago
Try restarting your PC if cmd still doesn't recognize it after setting the new environment path. Worked for me.
@PlanetXtreme 6 months ago
godsend tutorial maker
@Sameh_Abdel-Qawy 1 year ago
This was very helpful. Thanks a lot! I'd like to know if there is a script to extract text from all the images automatically, without selecting them one by one? Thanks again.
@rudeus8998 1 month ago
1:45 I followed up to this point, then typed "tesseract" in the command prompt, but it's still saying "tesseract not recognized as internal or external command"
@JayMartMedia 1 month ago
Double check that the file path for the tesseract executable has been added to the PATH environment variable. Also, you will need to open a NEW command prompt after the file path has been added to the PATH environment variable. This is because the PATH environment variable is loaded when the command prompt starts, so if PATH is updated while the command prompt is already open, it does not have the new PATH value.
@rudeus8998 1 month ago
@JayMartMedia I double checked. Still didn't work.
@bishalsharma9238 1 month ago
@JayMartMedia Same goes for me, didn't work.
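If double checking by eye doesn't settle it, a small script can report what a freshly started process actually sees. The sketch below uses Python's standard `shutil.which`; the helper names `find_tesseract` and `path_entries` are made up for this example. Run it from a newly opened command prompt, since an already open one still carries the old PATH.

```python
import os
import shutil


def find_tesseract(path=None):
    """Return the full path of the tesseract executable, or None.

    With path=None, shutil.which searches the PATH of this process.
    """
    return shutil.which("tesseract", path=path)


def path_entries(path=None):
    """Split a PATH string into its individual folders."""
    raw = os.environ.get("PATH", "") if path is None else path
    return [entry for entry in raw.split(os.pathsep) if entry]


if __name__ == "__main__":
    location = find_tesseract()
    if location:
        print("tesseract found at", location)
    else:
        # Listing the folders makes a typo or a missing entry visible.
        print("tesseract is not on PATH; folders searched:")
        for entry in path_entries():
            print("  ", entry)
```

If the folder holding tesseract.exe is missing from the printed list, the PATH change was not saved or the prompt was not reopened.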
@manhvo242 6 months ago
0:45 How did you download it so fast? I tried to download it but it's only doing ~250 KB/s (sometimes it drops to 0). I know I have a fast network but it just downloads so slowly.
@sahil5124 1 year ago
Thank you so much, it is working.
@23498cna 5 months ago
Wow, you are absolutely fantastic! Thank you so much!
@JayMartMedia 5 months ago
Thanks for commenting! Glad you found it helpful!
@legitordont 1 year ago
Thank you, you are the best
@samrahmazhar2716 4 years ago
How do I give PDF input to Tesseract?
@QuranicHealingIN 9 months ago
Hi Jay! I want to use Tesseract OCR to convert images to text in bulk. Please help!
@CoopAsh 7 months ago
When I open the tesseract-result it is opened in Notepad. How do I open it in Tesseract like you did?
@JayMartMedia 7 months ago
Are you talking about 2:22? If so, the file is being opened in a text editor called Atom. It's not being opened in Tesseract. Atom is a free text editor, but most people just use VS Code by Microsoft nowadays.
@HiPh0Plover1 5 years ago
Nice vid, where is the Python integration part?
@guillermocascomiranda 2 years ago
Very interesting video. How can I make it recognize only 100% black text, and discriminate against other colors such as gray, blue, buttons and images?
@JayMartMedia 2 years ago
I don't see anything in the Tesseract documentation specifically about matching color: github.com/tesseract-ocr/tessdoc/blob/main/ImproveQuality.md But one option would be to pass the image through an image processor first to filter out any unneeded color, before passing the image to Tesseract. You could do that manually with photo editing software such as GIMP, or automate it with a Python script using the OpenCV library: medium.com/featurepreneur/colour-filtering-and-colour-pop-effects-using-opencv-python-3ce7d4576140 Or use 'convert' from the command line: stackoverflow.com/questions/29742123/remove-all-except-one-colour-from-an-image-commandline-or-code
@guillermocascomiranda 2 years ago
@JayMartMedia You're right, your help was huge!!! I tried this code "threshold(img,T,255,cv.THRESH_BINARY)" and the filter works, where 'T' is the threshold value, below or above which it turns everything black or white. The problem is that I have thousands of screenshots, and the idea is to automate them in such a way that I don't have to manually edit them with an image editor like Paint, Gimp, Ps, Ai, etc. In addition, the extra problem (and advantage) I have is that the words ALWAYS appear in the same region of the capture, and I would like it to take ONLY that region, not the surrounding images or buttons. It's just black and gray letters. I need only the black ones, listed one below the other in a TXT or CSV file for Excel. The captures are digital, from a mobile application, so the quality is very good, centered, and always in the same typography, but I don't know how to automate it as much as possible (so that it extracts only the black words in the screenshot and lists them in a TXT).
@JayMartMedia 2 years ago
This article about using Tesseract in Python may help, though it is fairly Python-programming intensive: nanonets.com/blog/ocr-with-tesseract/
@guillermocascomiranda 2 years ago
@JayMartMedia You're simply a genius. You have no idea how much you helped me. Kind regards from Argentina. You earned a new subscriber :)
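To make the thresholding idea in this thread concrete, here is a tiny pure-Python illustration of the per-pixel rule behind a binary threshold (a pixel above T becomes 255, anything else becomes 0), plus cropping to a fixed region. It is not OpenCV itself: the helpers `threshold_binary` and `crop` and the sample pixel values are assumptions made up for this sketch, working on a grayscale image stored as nested lists.

```python
def threshold_binary(gray, thresh, maxval=255):
    """Binary threshold: pixels above `thresh` become `maxval`, others 0."""
    return [[maxval if pixel > thresh else 0 for pixel in row] for row in gray]


def crop(gray, top, bottom, left, right):
    """Keep rows top..bottom-1 and columns left..right-1."""
    return [row[left:right] for row in gray[top:bottom]]


# Hypothetical 2x4 grayscale patch: 255 = white background,
# 0 = black text, 128 = gray text that should be ignored.
patch = [
    [255, 0, 128, 255],
    [255, 0, 128, 255],
]

# With T = 100 the gray pixels (128) turn white and vanish into the
# background, while the black pixels (0) stay black.
filtered = threshold_binary(patch, 100)

# Keep only the columns where the text is known to appear.
region = crop(filtered, 0, 2, 1, 3)
```

On real screenshots the same two steps would be done with an image library before handing the result to Tesseract; picking T somewhere between the black and gray values is the part that needs experimenting.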
@zidouneca 1 month ago
And how do I convert 1000 images to text at the same time? What is the code for that?
@JayMartMedia 1 month ago
This video has an example of how to do that with a Python script: kzbin.info/www/bejne/fn-mqqOMm8qHmtk
@zidouneca 1 month ago
Thank you
@rizaladhi7066 1 year ago
Please share a tutorial on finding a specified image containing text in a folder that has 500 images.
@lwjunior2 1 year ago
Can the program conduct OCR on an entire folder of images?
@ElBart0oo8 5 months ago
Hi, thank you so much for the video, I found it highly professional. How could you set up Tesseract to continuously extract text from a given portion of the screen?
@JayMartMedia 5 months ago
There may be a better tool to use for this. But in theory you could do something like this by writing a script to capture a screenshot and run it through Tesseract every few seconds. It could take a bit of programming though.
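The "screenshot every few seconds" idea could be structured along these lines. This is only a skeleton under assumptions: `poll_ocr`, `capture` and `recognize` are hypothetical names, and the actual screenshot and OCR steps are passed in as functions rather than implemented here.

```python
import time


def poll_ocr(capture, recognize, interval_seconds=5.0, iterations=3,
             sleep=time.sleep):
    """Capture and OCR a screen region a fixed number of times.

    capture   -- function returning an image of the region (assumed)
    recognize -- function turning that image into text (assumed)
    sleep     -- injectable so the loop can be tried without waiting
    """
    results = []
    for i in range(iterations):
        image = capture()
        results.append(recognize(image))
        # Wait between captures, but not after the last one.
        if i < iterations - 1:
            sleep(interval_seconds)
    return results
```

One possible pairing, stated as an assumption rather than a recommendation, is Pillow's `ImageGrab.grab(bbox=...)` for `capture` and pytesseract's `image_to_string` for `recognize`.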
@cindylloyd306 2 years ago
I cannot see what you're typing in the command lines. Also, I can follow fine up to CD Pictures. How do you tell Tesseract the directory where my image file is? I have an external drive where I store image PDFs. I would like to OCR those but all I get is nothing. I've no idea what the output file is, nor how to enter it. I would have preferred seeing the full screen command lines and having directories explained. I know the others are thrilled but I'm a goob as far as this kind of stuff goes. 🤣🤣🤣
@AkioEndo197 6 months ago
Can I simply launch it as an executable rather than changing my registry?
@JayMartMedia 6 months ago
You can use the full path to tesseract rather than adding it to the PATH. For example: C:\Program Files\... You may need to wrap the path to tesseract in quotes if it contains a space.
@AkioEndo197 6 months ago
@JayMartMedia I want to be able to launch it as an executable without changing my files or anything like that. I'm fine with downloading and uninstalling though.
@JayMartMedia 6 months ago
When running it through the command prompt, it is running the executable. Are you wondering if there is a graphical user interface, rather than using the command line? If so, here is a web-based Tesseract tool that you could use: tesseract.projectnaptha.com/ Video on the web-based project: kzbin.info/www/bejne/qne6YXiufJmEkJYsi=NjpporeeM7q07szi
@AkioEndo197 6 months ago
@JayMartMedia I want the Japanese one.
@Midnasv 1 year ago
Thanks. To the point.
@Midnasv 1 year ago
Do you know if it is possible to use OCR with a password-protected PDF?
@krishnaagarwal2056 1 year ago
I can't download the package. It was flagged as containing a virus. Can you suggest any other software to download?
@GlobalEconInsights 7 months ago
Best tutorial for that
@पापानटोले 4 years ago
Great. Any idea how we can train a special character like a checkbox with a tick?
@tazyeenalam 6 months ago
Does this also work for PDF files that have numerous pages of scanned images?
@MOHAMEDIBRAHIM-yw6pt 6 months ago
Hi, I am also looking into whether it is feasible to read a PDF file with multiple pages and detect a signature in that file using OCR. If you have done that, can you please help me out?
@suniltiwari4387 3 years ago
Can you help us with how to install CVAT on a local server, on Windows Server 2019, please?
@Levan-u8w 8 days ago
Everything goes perfectly until I run the command. No such window appears. As for cmd, not even an error appears; it just reads that I want tesseract.result and it's gone. The cmd tab says that I am using tesseract for a second, but even that vanishes too.
@JayMartMedia 8 days ago
Are you starting CMD, then running the tesseract command, and then it closes? Or are you running the tesseract command from a .bat script? Does the CMD window disappear if you run a different command (for example "dir")?
@Levan-u8w 8 days ago
@JayMartMedia Actually I just searched and it looks like Tesseract created a txt document in the same folder that my picture was in. The only difference was that mine did not pop up like yours. Everything's OK, but thanks for the attention anyway :D
@JayMartMedia 8 days ago
@Levan-u8w Try putting "stdout" after the name of your input file and see if that does it (stdout = standard out). For example: "tesseract myimage.jpg stdout"
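The same "stdout" trick is handy from a script, because the recognized text comes back directly instead of landing in a .txt file. A minimal sketch, assuming `tesseract` is on PATH; `ocr_to_text` and its `run` parameter are hypothetical names used only for this example.

```python
import subprocess


def ocr_to_text(image_path, run=subprocess.run):
    """Return recognized text by asking tesseract to write to stdout.

    `run` defaults to subprocess.run and is a parameter only so the
    function can be tried without Tesseract installed.
    """
    completed = run(
        # Same arguments as: tesseract myimage.jpg stdout
        ["tesseract", str(image_path), "stdout"],
        capture_output=True,  # collect what tesseract prints
        text=True,            # decode the output to str
        check=True,           # raise if tesseract exits with an error
    )
    return completed.stdout
```

For example, `print(ocr_to_text("myimage.jpg"))` would show the text in the console, assuming that image exists in the current folder.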
@drallisimo34 9 months ago
Very useful tutorial! 5*
@yeahx32p69 11 months ago
Thx a lot. Going to automate my grocery list record misery 😂😂
@123LuisX 2 years ago
I added the path to a variable but it is still not recognized in my Windows. Do you know what I need to do?
@JayMartMedia 2 years ago
You may need to close all your command prompts, and then reopen the command prompt in order to reload those variables.
@alexlenhoff6274 2 years ago
@JayMartMedia Thanks! I was having the same problem and that fixed it.
@AtomicTech37 4 years ago
pretty helpful!
@Black-ie8qz 3 months ago
Thanks God.
@Farrolet 7 months ago
thank you sir
@aakashstudyspecials8196 4 years ago
ty so much
@bouchelligamohamedhedi2747 4 years ago
pretty helpful
@bilawalmalik-tm6np 7 months ago
good video
@IR_Mediaa 1 year ago
Try Mort OCR, guys, it works well.
@MrBorgj 4 years ago
Anyone who just wants a GUI for Tesseract should look for gImageReader :D
@sebastianparias1962 8 months ago
thanks
@Papiii_benz 8 months ago
I'm still having trouble
@JayMartMedia 8 months ago
What are you having trouble with?
@JF-pl2fh 8 months ago
Best not to install it in Program Files and to keep it in users/ instead, because Windows messes with Program Files and was ruining my workflow. Good tutorial still.
@lopezgladwell2014 4 years ago
Didn't work.
@papastalin3498 5 months ago
"The system cannot find the path specified" mf it's Pictures
@mohamedkhalith4629 4 years ago
Nice video, but too fast. I set the playback speed to 0.5x. Who else did?
@cornevanzyl5880 2 years ago
I don't like the implementation. I want something with 2 clicks. Open the app, snip my content, copy the text and use it. Simple.
@mattg2770 2 years ago
Well, there are some of those but you have to pay. This is not that.
@MyProfitCodeDotCom 1 year ago
Thanks for the short, effective guide... are you still going to show how to integrate with Python? (This has been asked about more than once by commenters.)
@BashfulNuke 5 months ago
Everything up to this point works correctly: tesseract test.png tesseract-results. Error: 'tesseract' is not recognized as an internal or external command, operable program or batch file.
@JayMartMedia 5 months ago
Have you added the full path to the folder with the tesseract.exe file to your PATH environment variable? After adding it as an environment variable you will need to close and relaunch the command prompt. (Environment variables such as PATH are loaded when the command prompt starts. So when it is already open and you add or change a variable, it doesn't get loaded immediately. The updated variables only get loaded when starting.) Another thing you could try is opening the command prompt in the folder where tesseract.exe is and seeing if the "tesseract" command works there. Let me know if that helps!
@BashfulNuke 5 months ago
@JayMartMedia I appreciate the quick response. I'll need to look back and see what I did, but I got it working. You earned my sub for this, I appreciate you.