One issue with the Chinese text is that the first two characters are traditional Chinese and the last two are simplified. So the first time you ran it and got 汉语,汉语, that was correct: it converted the traditional characters into simplified because you used chi_sim.
@elieferland85613 жыл бұрын
Thank you for all your good tutorials! Could you make a video on natural scene text detection using opencv and EAST one day??
Hi... I ran your code again and it is working, so you may ignore the above comment. The problem is that my file has a dark background, and the threshold values you showed do not seem to convert the background to white, so print(text) still does not show the text embedded in the picture. My picture is a PNG file. Will that require a different adjustment to your code and the threshold values? Thanks, really appreciate it.
@ForMyOwn_111 ай бұрын
It was a very informative and helpful lesson. Thanks.
@marwen25944 жыл бұрын
Thanks for the tutorial. I have a question, please: can Tesseract OCR detect handwriting? If so, can you make another tutorial about that?
@Benm27933 жыл бұрын
Adaptive threshold is a great tip. Thanks!
@wimr.96722 жыл бұрын
thanks for the tut! Helped me a lot
@ahmedhelal9203 жыл бұрын
Very good introduction to OCR. Thanks 😊
@RapidView4 жыл бұрын
Thanks a lot. What would the processing be when you get dynamic images?
@hemantchauhan64376 ай бұрын
NEED HELP! I am making a website where a user can upload a PDF, but I want the upload to succeed only if the PDF contains images of HANDWRITTEN text and nothing else. Thank you for reading.
@davidralte45723 жыл бұрын
Thank you for your help. May God bless you.
@sauravsinha8746 Жыл бұрын
Could you tell us how we can get this data in CSV format?
@heinhane4 жыл бұрын
This is very helpful. Thanks.
@bibhutirajansingh3 жыл бұрын
Can a multi-page PDF be OCRed this way?
@ok-kp5jn4 жыл бұрын
Great content!!
@jay48666 ай бұрын
Hi, can you do the same thing using a Raspberry Pi or Arduino?
@len52042 жыл бұрын
Hi, thanks for the tutorial. I was wondering how it would work with a photo upload feature, so that we don't have to change the image filename every time. Is that possible?
@maloukemallouke97354 жыл бұрын
Thank you for sharing. I am wondering whether OCR is a heavy process for finding digits in large images?
@AI_CANISTER3 жыл бұрын
It's simple. To get only digits from a large image: config = r'--oem 3 --psm 6 outputbase digits' and then digits = pytesseract.image_to_string(img, config=config). When you print digits, you get only digits.
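A minimal sketch of the digits-only idea from the reply above, not a definitive implementation. It reuses the config string quoted in the comment; the helper names (build_digit_config, keep_digits, read_digits) are hypothetical and only for illustration. Since the digits config may still let other characters through on some Tesseract builds (an assumption), the result is filtered once more in Python.

```python
def build_digit_config(psm=6, oem=3):
    """Build the Tesseract config string quoted in the comment above."""
    return f"--oem {oem} --psm {psm} outputbase digits"


def keep_digits(text):
    """Keep only ASCII digits on each line and drop lines that end up empty."""
    kept = []
    for line in text.splitlines():
        digits = "".join(ch for ch in line if ch in "0123456789")
        if digits:
            kept.append(digits)
    return "\n".join(kept)


def read_digits(image_path):
    """OCR an image file and return only the digits found in it."""
    # Imported here so the helpers above work without OpenCV or pytesseract.
    import cv2
    import pytesseract

    img = cv2.imread(image_path)
    if img is None:
        raise FileNotFoundError(image_path)
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    # config is passed by keyword: the second positional argument is lang.
    raw = pytesseract.image_to_string(gray, config=build_digit_config())
    return keep_digits(raw)
```

Calling read_digits("invoice.png") (a hypothetical filename) would return the digit runs line by line. If nothing comes back, trying another --psm value is a reasonable next step.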
@maloukemallouke97353 жыл бұрын
@@AI_CANISTER Thanks, but I already tested it and it does not work.
@tushargawade20454 жыл бұрын
Really great and helpful! If we train our model using a CNN, will it increase accuracy?
@sauravsinha8746 Жыл бұрын
How can we save the output in CSV format, please?
@dimitheodoro3 жыл бұрын
Thanks a lot!!!
@firegames27413 жыл бұрын
Thanks for such a useful video. I need your help: can you convert a captcha file to text? I'm trying, but it is not converting properly.
@andreadotta66534 жыл бұрын
Hi Sergio, thanks for the video! I have a question. Do you have any ideas for converting raster images into vector images (e.g. JPG to SVG)?
@eksimedya46594 жыл бұрын
Thank you very much, my brother.
@marienoellevandervlugt91834 жыл бұрын
I love your tutorial. I'm French and English is difficult for me, but I want to learn Python.
@r-beanmondy62033 жыл бұрын
If I want to use it on live video, which part of my code should I change?
@gawaderajesh4 жыл бұрын
I am getting the error below. Please help:
raise TesseractError(proc.returncode, get_errors(error_string))
pytesseract.pytesseract.TesseractError: (1, 'Error opening data file C:\\Program Files\\Tesseract-OCR/tessdata/chi_sim.traineddata Please make sure the TESSDATA_PREFIX environment variable is set to your "tessdata" directory. Failed loading language \'chi_sim\' Tesseract couldn\'t load any languages! Could not initialize tesseract.')
@gowthamns82284 жыл бұрын
Wow, very good, but the problem is that the output is only correct when the text is very clear and crisp. I want to know what to do when the image contains more than just text, for example bills, a photo of a calendar, or other kinds of images. How do I print the string from that? I tried it myself and it's not printing anything. Any ideas?
@AI_CANISTER3 жыл бұрын
It can detect digits, and since dates are mostly digits I think it will work well: config = r'--oem 3 --psm 6 outputbase digits' and then digits = pytesseract.image_to_string(img, config=config). When you print digits, you get only digits.
@vinsmokearifka3 жыл бұрын
Thank you. How do I get only the title?
@sauravsinha8746 Жыл бұрын
How can we get this data into CSV format?
@riyajagtap5006 Жыл бұрын
Can it work for 7-segment LEDs?
@opendllmaster51252 жыл бұрын
Isn't it possible to train Tesseract to improve the reading?
@abdelrhmanshokr75463 жыл бұрын
This helped a lot, but there is still a date value that it doesn't seem to pick up. I don't know why, to be honest.
@user-zo2gm1bh1f3 жыл бұрын
Please, sir, which lang value should I use for Arabic? text = pytesseract.image_to_string(adaptive_threshold, config=config, lang="arbic"): ARB, AR, or what?
@sauravsinha8746 Жыл бұрын
Can we save to CSV format?
@dukeofminecraft3 жыл бұрын
Can you train pytesseract on handwriting data and return the string data?
@ashishzarekar95994 жыл бұрын
Could you please help with how to implement this for scanned and digital PDFs?
@lakshmitejaswi78324 жыл бұрын
Make a video on how to build a custom OCR.
@nikolaydd62194 жыл бұрын
Thanks
@LuisGarcia-tb9po3 жыл бұрын
Tesseract has been adding arrows to each cell in my Excel spreadsheets. Does anyone know why that might be? It recognizes every word and number correctly but adds some kind of "illegal" character code that Excel displays as arrows, then boxes with a question mark inside.
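One possible cause, offered as an assumption rather than a diagnosis: Tesseract's text output usually ends each page with a form-feed control character, and spreadsheet programs tend to show such characters as odd symbols. The sketch below strips non-printable characters before writing rows to CSV, which also touches on the CSV questions elsewhere in this thread. The function names are hypothetical, and splitting cells on whitespace is a simplification that will not suit every layout.

```python
import csv
import io


def clean_cell(value):
    """Remove non-printable characters (such as form feed) from one cell."""
    return "".join(ch for ch in value if ch.isprintable()).strip()


def text_to_rows(text):
    """Split OCR text into rows of whitespace-separated cells, skipping empty rows."""
    rows = []
    for line in text.splitlines():
        cells = [clean_cell(cell) for cell in line.split()]
        cells = [cell for cell in cells if cell]
        if cells:
            rows.append(cells)
    return rows


def rows_to_csv(rows):
    """Serialize rows to CSV text; the csv module handles quoting."""
    out = io.StringIO()
    csv.writer(out, lineterminator="\n").writerows(rows)
    return out.getvalue()


def save_csv(text, path):
    """Write cleaned OCR text to a CSV file at the given path."""
    with open(path, "w", newline="", encoding="utf-8") as handle:
        handle.write(rows_to_csv(text_to_rows(text)))
```

With text coming from pytesseract.image_to_string, save_csv(text, "output.csv") would write one CSV row per recognized line.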
@thesupremeeagle21464 жыл бұрын
Do you have another way to use Tesseract? I want to turn my program into an executable for people to download, but to make it work they need the tesseract-ocr files. So I put them into the download, but the Tesseract files are too heavy! I don't want my program to be 1 GB because of Tesseract! Please help :c
@AKKJ4202 жыл бұрын
Do an ANPR mate
@juanpajaro40844 жыл бұрын
Thanks, my friend, you cleared up a lot of my doubts.
@senpaikun59473 жыл бұрын
Hey... I'm not able to print those Chinese characters in my output... Can someone help me out, please?
@luismata40864 жыл бұрын
How can I make the algorithm recognize LaTeX? ---> pytesseract.image_to_string(img, lang = '?') . What should I use for the "lang" parameter?
@NicolaMastrandrea4 жыл бұрын
Thank you 😊
@revudevendraswamy66324 жыл бұрын
Does it work for handwritten data?
@kascesar4 жыл бұрын
Hello, I'm looking for an R-CNN for this task. Do you know a good one?
@lukmanchaiyarab14514 жыл бұрын
Can you please show digit recognition? Thanks in advance.
@AI_CANISTER3 жыл бұрын
config = r'--oem 3 --psm 6 outputbase digits' and then digits = pytesseract.image_to_string(img, config=config). When you print digits, you get only digits.
@Terminator-lx5jx3 жыл бұрын
You didn't solve the text after preprocessing, though.
@rajeevkalaskar63733 жыл бұрын
I want to make a desktop application for this. How can I do it? Need help 🆘
@pysource-com3 жыл бұрын
For commercial projects/consulting services you can contact me here: pysource.com/services
@towhidurrahman82024 жыл бұрын
Is number plate recognition possible with this code? And with Bengali as the language?
@AI_CANISTER3 жыл бұрын
It can't detect the number plate itself, but it can extract the digits and letters once the plate has been detected with a different method.
@ankursoni80603 жыл бұрын
How do I detect text at particular coordinates in an image?
@pysource-com3 жыл бұрын
First you need to crop that region. Check my YouTube videos on "Crop images" and you'll know how to do that. Once you have cropped the portion, you can pass it to the OCR.
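A rough sketch of the crop-then-OCR step described in the reply above, under a few assumptions: the region is given as x, y, width and height in pixels, and the helper names (clamp_box, ocr_region) are made up for this example. The box is clipped to the image bounds first so that the slice is never empty by accident.

```python
def clamp_box(x, y, w, h, img_w, img_h):
    """Clip an (x, y, w, h) box to the image bounds; returns (x0, y0, x1, y1)."""
    x0 = max(0, min(x, img_w))
    y0 = max(0, min(y, img_h))
    x1 = max(x0, min(x + w, img_w))
    y1 = max(y0, min(y + h, img_h))
    return x0, y0, x1, y1


def ocr_region(image_path, x, y, w, h, lang="eng"):
    """Crop a rectangular region out of an image file and run OCR on it."""
    # Imported here so clamp_box works without OpenCV or pytesseract.
    import cv2
    import pytesseract

    img = cv2.imread(image_path)
    if img is None:
        raise FileNotFoundError(image_path)
    img_h, img_w = img.shape[:2]
    x0, y0, x1, y1 = clamp_box(x, y, w, h, img_w, img_h)
    if x1 == x0 or y1 == y0:
        raise ValueError("region lies outside the image")
    roi = img[y0:y1, x0:x1]  # NumPy slicing: rows (y) first, then columns (x)
    roi = cv2.cvtColor(roi, cv2.COLOR_BGR2RGB)  # OpenCV loads images as BGR
    return pytesseract.image_to_string(roi, lang=lang)
```

For example, ocr_region("page.jpg", 50, 120, 400, 60) (hypothetical file and coordinates) would read only the text inside that rectangle.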
@kisamesafe3 жыл бұрын
it could be shorter
@sarthakgarg65314 жыл бұрын
How can we read different fonts? For example, if the image has an italic font, how can we do that?
@AI_CANISTER3 жыл бұрын
I've tried the same method and it worked, I'm sure it will work for you too
@jeffu733 жыл бұрын
For me, the image is not getting recognized.
@raquelcosta27303 жыл бұрын
Same for me :( I have tried different clear images and it's not working. Any tips?
@AneleMbabela4 жыл бұрын
Instructions and source code link is broken.
@pysource-com4 жыл бұрын
I've just fixed it, thanks for pointing that out
@AneleMbabela4 жыл бұрын
@@pysource-com Thanks for the work you've been putting out. It's really making a difference. God bless you, brother.
@NoamHarel-Google-Is-The-Best4 жыл бұрын
I need some help. It shows this line: "You need configured Python 2 SDK to render Epydoc docstrings". Thanks a lot.
@nuwanthajayasinghe1153 жыл бұрын
I installed Tesseract and tried to work in VS Code. But when programming Python in VS Code, tesseract could not be imported. Can you tell me how to import tesseract in VS Code?
@jeffu733 жыл бұрын
What is he using, VS Code or something else?
@rohitnara67384 жыл бұрын
cv2.imread("img.gif") is not working. How can we read text from a .gif file? Please tell.
@jasonlo34293 жыл бұрын
The Chinese words are traditional and not simplified. Change the language to chi_tra and it should work better
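A small sketch of switching the language code as suggested above, assuming the chi_tra or chi_sim traineddata files are installed in the tessdata directory. The helper names (choose_lang, ocr_chinese) are hypothetical. pytesseract.get_languages is used to list what is installed; it is available in recent pytesseract releases, so treat that as an assumption for older versions. Checking first avoids the "Failed loading language" error quoted earlier in the thread.

```python
def choose_lang(installed, preferred=("chi_tra", "chi_sim")):
    """Return the first preferred language code that is installed, else None."""
    for code in preferred:
        if code in installed:
            return code
    return None


def ocr_chinese(image_path):
    """OCR an image with traditional Chinese if available, else simplified."""
    # Imported here so choose_lang works without OpenCV or pytesseract.
    import cv2
    import pytesseract

    lang = choose_lang(pytesseract.get_languages(config=""))
    if lang is None:
        raise RuntimeError("no Chinese traineddata found in the tessdata directory")
    img = cv2.imread(image_path)
    if img is None:
        raise FileNotFoundError(image_path)
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    return pytesseract.image_to_string(gray, lang=lang)
```

Swapping the order of the preferred tuple makes simplified Chinese the first choice instead.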