Hi! Thanks a lot for the extraction. I want to convert a scanned PDF to an editable Word document. In the above video the accuracy is only 97%.
@techwithzoum 1 year ago
Hi Swetha, you're welcome! Could you please elaborate on your question?
@sarasa971 1 year ago
How do I add other languages in the code? Thank you for the great explanation 👏🏼
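A minimal sketch of one way to handle other languages, assuming the video's code calls pytesseract (the error messages elsewhere in this thread suggest pdf2image plus Tesseract). Tesseract accepts several language codes joined with `+` in the `lang` argument, and each language's trained data file has to be installed separately. The helper names `build_lang_arg` and `ocr_image` are hypothetical, not taken from the video.

```python
def build_lang_arg(languages):
    """Join Tesseract language codes into the form the `lang` argument expects."""
    codes = [code.strip() for code in languages if code.strip()]
    if not codes:
        raise ValueError("at least one language code is required")
    return "+".join(codes)


def ocr_image(image, languages=("eng",)):
    """OCR one PIL image in the given languages.

    Needs pytesseract and a Tesseract install that has the matching
    language data files.
    """
    import pytesseract  # imported lazily so build_lang_arg works without it

    return pytesseract.image_to_string(image, lang=build_lang_arg(languages))
```

For example, `ocr_image(page, languages=["eng", "fra"])` would ask Tesseract to recognise English and French on the same page.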
@dyzy2203 2 years ago
Thanks a lot. The code works smoothly. Nice. Can you find and extract a table from a scanned PDF and save it into a DataFrame? Thanks.
@amanrohada9008 2 years ago
Did you find something to extract a table from a scanned PDF?
@harshvardhanmishra1256 3 months ago
Did either of you find a solution? If so, please help me out; I am approaching my deadline and have to complete the task.
@kenvinmq 1 year ago
Thank you bro, I’ll try that out
@techwithzoum 1 year ago
You are welcome!
@zsuzsannakristof2117 1 year ago
Hi, can you modify the code so that the new file keeps, alongside the text, the original page settings and structure of the original PDF? That is, the text should be in the same place as it was in the original PDF.
@techwithzoum 1 year ago
Hi Zsuzsanna, I am not sure I understand your request. Could you please elaborate so I can assist you better?
@omuskaikar-gs1cs 1 year ago
There is the OCRmyPDF library; with its --force-ocr option it retains the original layout of the PDF.
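A rough sketch of how the OCRmyPDF suggestion might be driven from Python, assuming the `ocrmypdf` command-line tool is installed and on PATH. The function names `build_ocrmypdf_command` and `add_text_layer` are hypothetical. The output is a PDF that keeps the page images and gains a text layer, rather than a plain text file.

```python
import subprocess


def build_ocrmypdf_command(input_pdf, output_pdf, force_ocr=True):
    """Build the argument list for the ocrmypdf command-line tool."""
    cmd = ["ocrmypdf"]
    if force_ocr:
        # --force-ocr redoes OCR on every page, even pages that already have text
        cmd.append("--force-ocr")
    cmd += [str(input_pdf), str(output_pdf)]
    return cmd


def add_text_layer(input_pdf, output_pdf, force_ocr=True):
    """Run ocrmypdf; raises CalledProcessError if the tool reports a failure."""
    subprocess.run(build_ocrmypdf_command(input_pdf, output_pdf, force_ocr), check=True)
```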
@sjohn-777 6 months ago
Thank you!
@techwithzoum 6 months ago
You're welcome!
@RunRonaldRun 1 year ago
Works great, thank you so much.
@techwithzoum 1 year ago
You're very welcome, Charl!
@kibtiachowdhury6011 2 years ago
Thanks a lot. The code works. I want to get paragraphs and titles without any tables or figures. How can I solve this?
@easylife891 1 year ago
Fantastic work!
@techwithzoum 1 year ago
Thank you!
@davisengelis272 1 year ago
Thanks a lot!
@techwithzoum 1 year ago
You're very welcome, Davis!
@hrishishetty9322 3 years ago
Thank you so much for the help!
@techwithzoum 3 years ago
You're welcome! Do not hesitate to drop video ideas!
@cherlynang2965 1 year ago
Does this work on a folder with multiple PDF files?
@techwithzoum 1 year ago
Yes, it does, Cherlynang.
@chepkoechfancy7553 3 years ago
Can this code work with a PDF given as a URL? If so, kindly share the lines of code to handle that.
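One possible approach for a PDF behind a URL, sketched under the assumption that the video's pipeline is pdf2image plus pytesseract: download the file into memory, then hand the bytes to pdf2image's `convert_from_bytes` instead of `convert_from_path`. The names `download_pdf_bytes` and `ocr_pdf_from_url` are hypothetical, and the `%PDF` header check is only a simple sanity test.

```python
import urllib.request


def download_pdf_bytes(url, timeout=30):
    """Fetch a PDF from a URL and return its raw bytes."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        data = response.read()
    # PDF files normally begin with the marker %PDF
    if not data.startswith(b"%PDF"):
        raise ValueError("the URL did not return a PDF file")
    return data


def ocr_pdf_from_url(url, poppler_path=None):
    """Download a PDF and OCR every page (needs pdf2image, Poppler, pytesseract)."""
    from pdf2image import convert_from_bytes
    import pytesseract

    pages = convert_from_bytes(download_pdf_bytes(url), poppler_path=poppler_path)
    return "\n".join(pytesseract.image_to_string(page) for page in pages)
```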
@ravimakwana5290 1 year ago
Sir, can you make a video on how to extract the paragraph under a title from a PDF?
@techwithzoum 1 year ago
Sure, Ravi! I will explore that!
@sivachaitanya6330 9 months ago
Which version is used in this? When I run it, I get a Poppler path error and an error about the Tesseract installation and path settings.
@jeyapauldavid5596 5 months ago
I am getting this error: Unable to get page count. Is poppler installed and in PATH?
@techwithzoum 5 months ago
This may be because your system cannot find the Poppler binaries. Here is how to set it up on a Windows machine:
1. Download the Poppler package from this website: poppler.freedesktop.org/
2. Unzip it in the C:\Program Files (x86) folder.
3. Store the path to the bin folder in a variable named as follows: poppler_path = r"C:\Program Files (x86)\poppler-24.02.0\bin"
I hope this helps.
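The steps above can be sketched in code as follows, assuming the conversion is done with pdf2image, whose `convert_from_path` accepts a `poppler_path` argument. The helpers `find_poppler_bin` and `pdf_to_images` are hypothetical names; the folder you pass in should be wherever you actually unzipped Poppler.

```python
import shutil
from pathlib import Path


def find_poppler_bin(candidate_dirs):
    """Return the first folder that contains Poppler's pdfinfo tool, else None."""
    for folder in map(Path, candidate_dirs):
        if any((folder / name).is_file() for name in ("pdfinfo.exe", "pdfinfo")):
            return str(folder)
    return None


def pdf_to_images(pdf_path, candidate_dirs=()):
    """Convert a PDF to PIL images, passing poppler_path only when it is needed."""
    from pdf2image import convert_from_path

    # When pdfinfo is already on PATH, poppler_path=None lets pdf2image find it.
    poppler_path = None if shutil.which("pdfinfo") else find_poppler_bin(candidate_dirs)
    return convert_from_path(pdf_path, poppler_path=poppler_path)
```

A call might look like `pdf_to_images("scan.pdf", [r"C:\Program Files (x86)\poppler-24.02.0\bin"])`, where `scan.pdf` is a placeholder file name.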
@mohammednisar1458 3 years ago
PDFPageCountError: Unable to get page count.I/O Error: Couldn't open file 'C:\Users\Naseer\Desktop\OCR-main\data\First Cry Image.pdf': No error.
@avinashkrishna8695 1 year ago
I'm getting an error: Output exceeds the size limit. Open the full output data in a text editor
@techwithzoum 1 year ago
Hi Avinash, can you tell me which line the error occurs on?
@jardanijonovich1951 1 year ago
Hi, I came across your video after multiple failed attempts at converting my file. Can I somehow ignore the headers and footers? Also, I have bullet lists in my documents and some of them continue on the next page; how do I take care of that? Thanks in advance!
@shainialakumbura5829 3 years ago
PDFInfoNotInstalledError: Unable to get page count. Is poppler installed and in PATH? Why am I getting this error?
@kiranvanukuri9382 3 years ago
Are you loading many PDFs at a time?
@mallikarjunyadav591 2 years ago
I am getting the same error.
@emanuelcalderon2912 6 months ago
Run brew install poppler on a Mac, or install Poppler some other way on Windows.
@QorQar 1 year ago
Could you give an example of how to use the code, where to put it, and how to run it?
@avbendre 1 year ago
I am getting this error: Failed to activate VS environment: Could not find C:\Program Files (x86)\Microsoft Visual Studio\Installer\vswhere.exe. Is there any solution to the above error? Please tell me.
@techwithzoum 1 year ago
Can you please refer to this discussion on Stack Overflow? It might be similar to what you are facing: stackoverflow.com/questions/54305638/how-to-find-vswhere-exe-path
@avbendre 1 year ago
@techwithzoum Thank you, the error was resolved when I added the Poppler and pytesseract paths to the system variables and installed pytesseract.exe.
@techwithzoum 1 year ago
@avbendre Congratulations!
@kiranvanukuri9382 3 years ago
Super, sir, but one question: how do I extract text from a group of many PDFs?
@KulranjanSingh 2 years ago
Use os.walk() or glob.glob().
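A small sketch of the `glob.glob()` route, assuming the per-file OCR step is pdf2image plus pytesseract as in the rest of this thread. The function names `find_pdfs` and `ocr_folder` are hypothetical. Note that the pattern `*.pdf` is case-sensitive on Linux and macOS, so files ending in `.PDF` would need a second pattern.

```python
import glob
import os


def find_pdfs(folder):
    """Return every .pdf file under `folder`, including subfolders, in sorted order."""
    pattern = os.path.join(folder, "**", "*.pdf")
    return sorted(glob.glob(pattern, recursive=True))


def ocr_folder(folder, output_dir):
    """OCR each PDF found and write one .txt file per PDF into output_dir."""
    from pdf2image import convert_from_path
    import pytesseract

    os.makedirs(output_dir, exist_ok=True)
    for pdf_path in find_pdfs(folder):
        pages = convert_from_path(pdf_path)
        text = "\n".join(pytesseract.image_to_string(page) for page in pages)
        # PDFs with the same file name in different subfolders overwrite each other here
        stem = os.path.splitext(os.path.basename(pdf_path))[0]
        with open(os.path.join(output_dir, stem + ".txt"), "w", encoding="utf-8") as fh:
            fh.write(text)
```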
@vishalgarg8423 5 months ago
PDFInfoNotInstalledError: Unable to get page count. Is poppler installed and in PATH?
@techwithzoum 5 months ago
This may be because your system cannot find the Poppler binaries. Here is how to set it up on a Windows machine:
1. Download the Poppler package from this website: poppler.freedesktop.org/
2. Unzip it in the C:\Program Files (x86) folder.
3. Store the path to the bin folder in a variable named as follows: poppler_path = r"C:\Program Files (x86)\poppler-24.02.0\bin"
I hope this helps.
@TiriAlain 3 years ago
It's useful, but my PC crashes from running out of memory or from high CPU temperature. ^^
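One thing that may help with memory, sketched here as an assumption rather than a confirmed fix: converting the whole PDF at once keeps every page image in RAM, whereas pdf2image's `first_page` and `last_page` arguments allow a few pages to be converted at a time. The names `page_ranges` and `ocr_pdf_in_batches` are hypothetical, and a lower `dpi` also reduces the size of each image.

```python
def page_ranges(total_pages, batch_size):
    """Yield 1-based (first_page, last_page) pairs that cover every page once."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for first in range(1, total_pages + 1, batch_size):
        yield first, min(first + batch_size - 1, total_pages)


def ocr_pdf_in_batches(pdf_path, batch_size=5, dpi=200, poppler_path=None):
    """OCR a PDF a few pages at a time so only one batch of images is in memory."""
    from pdf2image import convert_from_path, pdfinfo_from_path
    import pytesseract

    total = pdfinfo_from_path(pdf_path, poppler_path=poppler_path)["Pages"]
    chunks = []
    for first, last in page_ranges(total, batch_size):
        pages = convert_from_path(
            pdf_path, dpi=dpi, first_page=first, last_page=last, poppler_path=poppler_path
        )
        chunks.extend(pytesseract.image_to_string(page) for page in pages)
        del pages  # drop this batch of images before converting the next one
    return "\n".join(chunks)
```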