Thank you so much. I do the same as you, but I always get an empty Excel file. Why would that be?
@hemu2723 10 months ago
Hey, did you find the mistake?
@ram_rahim_creations_officials 1 year ago
Hi @karndeep, thank you for sharing. Will it work if my table doesn't have vertical and horizontal lines?
@ShreyasG-d2n 4 months ago
It should.
@nomuchohan 1 year ago
Dude, please explain how to use PPStructure from PaddlePaddle in our own custom code.
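For readers with the same question, here is a minimal sketch of calling PP-Structure from your own Python code instead of running predict_table.py. It assumes a paddleocr 2.x release, where `PPStructure` and `save_structure_res` are importable from `paddleocr`; newer releases may expose a different API. The helper names `table_html_from_result` and `run_ppstructure` and the `./output` folder are illustrative, not part of the library.

```python
import os


def table_html_from_result(result):
    """Collect the HTML of every region that PP-Structure labelled as a table."""
    htmls = []
    for region in result:
        if region.get("type") == "table":
            res = region.get("res") or {}
            # Assumption: for tables, 'res' is a dict whose 'html' key holds the markup.
            if isinstance(res, dict) and res.get("html"):
                htmls.append(res["html"])
    return htmls


def run_ppstructure(img_path, save_folder="./output"):
    """Run table recognition on one image and return the recognised table HTML."""
    # Imported lazily so the helper above works without PaddleOCR installed.
    import cv2
    from paddleocr import PPStructure, save_structure_res

    engine = PPStructure(layout=False, show_log=True)  # layout=False: treat the image as a table
    img = cv2.imread(img_path)
    if img is None:
        raise FileNotFoundError(f"could not read image: {img_path}")
    result = engine(img)
    basename = os.path.splitext(os.path.basename(img_path))[0]
    save_structure_res(result, save_folder, basename)  # saves the results under save_folder/basename
    return table_html_from_result(result)
```

Calling `run_ppstructure("table.png")` would then return a list of HTML strings, one per detected table, which makes it easy to see whether anything was recognised at all.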
@ajithn7336 10 months ago
I tried it and I always get an empty Excel file.
@xy4611 24 days ago
Same.
@niroshiniedayaratne4066 2 years ago
My output is always an empty xlsx file. What could be the reason? Thanks in advance!
@karndeepsingh 2 years ago
Maybe the OCR is unable to read the table content.
@kishoripawar2522 1 year ago
@@karndeepsingh Is there any prerequisite for the input image, such as a resolution above X? For me as well, the output is empty.
@kishoripawar2522 1 year ago
@@karndeepsingh Even with a high-resolution image the output is empty. When I checked show.html, the blue box did not correctly locate the table in the image. So I think that, because there is no text inside the blue box, the csv is empty. Please correct me if I am wrong.
@pavitrabiradar6334 1 year ago
@@kishoripawar2522 I am also getting an empty xlsx as output. Did you find any solution?
@보라색사과-l1r 1 year ago
Any update on this issue? I am facing this issue after trying another OCR model... please help.
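One way to narrow down the empty-output problem discussed in this thread is to look at the HTML the table model produced (for example the contents of show.html) and count how many cells exist versus how many contain text. The sketch below uses only the standard library; `CellCounter` and `cell_stats` are illustrative names. Zero cells would suggest the table structure was not detected, while many cells with no text would point at the OCR step.

```python
from html.parser import HTMLParser


class CellCounter(HTMLParser):
    """Counts <td>/<th> cells and how many of them contain visible text."""

    def __init__(self):
        super().__init__()
        self.total = 0
        self.filled = 0
        self._in_cell = False
        self._has_text = False

    def handle_starttag(self, tag, attrs):
        if tag in ("td", "th"):
            self._in_cell = True
            self._has_text = False
            self.total += 1

    def handle_data(self, data):
        # Whitespace-only content does not count as text.
        if self._in_cell and data.strip():
            self._has_text = True

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._in_cell:
            if self._has_text:
                self.filled += 1
            self._in_cell = False


def cell_stats(html):
    """Return (total_cells, cells_with_text) for an HTML table string."""
    parser = CellCounter()
    parser.feed(html)
    parser.close()
    return parser.total, parser.filled
```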
@avikalchauhan9907 1 year ago
When I run the code, the predict_table.py file is not there.
@kiddicode6897 2 years ago
How can I apply Google Vision after the table is recognized?
@venkatesanr9455 2 years ago
Thanks for the great explanation and video. I have some questions:
1. Is paddleocr an open source library that anyone can use?
2. Can we fine-tune OCR models such as easyocr and paddleocr? Kindly reply and share links that would be useful for reading/learning.
3. Does the Hugging Face library have OCR models?
@karndeepsingh 2 years ago
1. Yes, PaddlePaddle is an open source library.
2. You can train an OCR model using paddleocr.
3. Hugging Face may not have OCR models.
@venkatesanr9455 2 years ago
@@karndeepsingh Thanks for your kind replies. Can you share any links for fine-tuning easyocr/paddleocr models? (I have searched for easyocr but did not find proper links for fine-tuning tasks.)
@karndeepsingh 2 years ago
@@venkatesanr9455 You can check the paddleocr GitHub for the same.
@venkatesanr9455 2 years ago
@@karndeepsingh Ok thanks a lot
@NickWindham 2 years ago
@@venkatesanr9455 Watch his video titled OCR Text from PDFs and Image Documents using docTR | Better than Tesseract OCR | Text Extraction
@ganeshrajv130 2 years ago
Won't this support a long image table?
@eliaweiss1 11 months ago
Thanks, all I get is empty cells
@jayeshnikam3279 1 year ago
This is kind of urgent. What if half of a table is on one page and the other half is on the next page? What can be done in such a situation? Will the model recognize it? I would really appreciate your answer, as I am currently working on this. Thank you! :)
@karndeepsingh 1 year ago
In such situations, you need to search the page for an identifier indicating that half of the information continues on the next page. The model can only help you extract or detect the table; on top of that, you need to apply your own logic to decide whether it holds the full information or only half.
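The "logic on top" could look something like the sketch below. It does not search for a page identifier as suggested above; it swaps in a simpler heuristic and treats a table as a continuation of the previous one when the column counts match, optionally dropping a header row repeated at the top of the new page. The function name `merge_continued_tables` and the input format (one table per page, each a list of rows of strings) are assumptions made for illustration.

```python
def merge_continued_tables(pages, drop_repeated_header=True):
    """Merge tables that appear to continue across consecutive pages.

    pages: list of tables, one per page, each a list of rows (lists of str).
    Heuristic only: same column count as the previous table means "continuation".
    """
    merged = []
    for table in pages:
        if not table:
            continue  # nothing was extracted from this page
        prev = merged[-1] if merged else None
        same_width = prev is not None and len(prev[0]) == len(table[0])
        if same_width:
            rows = table
            # Drop a header row that is repeated at the top of the new page.
            if drop_repeated_header and table[0] == prev[0]:
                rows = table[1:]
            prev.extend(list(r) for r in rows)
        else:
            merged.append([list(r) for r in table])
    return merged
```

Unrelated tables that happen to have the same number of columns would be merged incorrectly, so a real pipeline would likely add further checks, such as a "continued" marker or the table's position on the page.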
@poojabhandari631 2 years ago
Getting this error:
error: legacy-install-failure
× Encountered error while trying to install package.
╰─> PyMuPDF
What should I do?
@Smddlvvs 2 years ago
How can I make this code work on PDF files with multiple pages?
@karndeepsingh 2 years ago
Pass each page of the PDF to the model.
@Smddlvvs 2 years ago
@@karndeepsingh I have tried, but I am unable to iterate.
@texasfossilguy 2 years ago
You need to write code to iterate over each page. Ask ChatGPT or Google that; I've seen it. If I find it, I'll let you know.
@Smddlvvs 2 years ago
@@texasfossilguy yaaaa pls let me know if you find one
@AliAlias 2 years ago
Use other Python libraries to convert the PDF to images, then OCR them one by one in a loop 😊
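The advice in this thread (render each PDF page to an image, then run the model on each image) can be sketched as follows. This assumes PyMuPDF (imported as `fitz`) is installed and recent enough to accept `get_pixmap(dpi=...)`. The helper names `page_image_path` and `pdf_to_images`, the 200 dpi default, and the file naming scheme are illustrative choices, not part of PaddleOCR.

```python
import os


def page_image_path(pdf_path, page_index, out_dir):
    """Build an output file name such as out_dir/report_page001.png."""
    stem = os.path.splitext(os.path.basename(pdf_path))[0]
    return os.path.join(out_dir, f"{stem}_page{page_index + 1:03d}.png")


def pdf_to_images(pdf_path, out_dir, dpi=200):
    """Render every page of a PDF to a PNG and return the image paths."""
    import fitz  # PyMuPDF, imported lazily so the helper above has no dependency

    os.makedirs(out_dir, exist_ok=True)
    paths = []
    with fitz.open(pdf_path) as doc:
        for i, page in enumerate(doc):
            pix = page.get_pixmap(dpi=dpi)  # rasterise the page
            path = page_image_path(pdf_path, i, out_dir)
            pix.save(path)
            paths.append(path)
    return paths
```

Each returned path can then be passed to the table model in a loop, for example by pointing `--image_dir` at the output folder.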
@louieelumbaring1790 2 years ago
How did you get the vqa folder? Sorry, I was trying to follow all the steps you did and hit an error on the last line. I have no idea how to fix it. Thanks in advance!
[Errno 2] No such file or directory: 'PaddleOCR/ppstructure'
/content/PaddleOCR/ppstructure/inference
Traceback (most recent call last):
  File "/content/PaddleOCR/ppstructure/table/predict_table.py", line 230, in <module>
    main(args)
  File "/content/PaddleOCR/ppstructure/table/predict_table.py", line 149, in main
    image_file_list = get_image_file_list(args.image_dir)
  File "/content/PaddleOCR/ppocr/utils/utility.py", line 60, in get_image_file_list
    raise Exception("not found any img file in {}".format(img_file))
Exception: not found any img file in /content/PaddleOCR/ppstructure/table/image1.png
@rivamalik9575 1 year ago
Provide the absolute path to the image placed in Drive, for example /content/gdrive/MyDrive/PaddleOCR/ppstructure/table/image1.png, and also ensure that the image is placed in the table folder mentioned in the exception message.
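Before launching predict_table.py, it can help to verify the path yourself so that the error message names the real problem. The following is a small standard-library sketch; `check_image_dir` and the list of accepted suffixes are illustrative assumptions, not PaddleOCR's own logic.

```python
from pathlib import Path

# Assumed set of image extensions; adjust to whatever your files use.
IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg", ".bmp"}


def check_image_dir(image_dir):
    """Return the image files at image_dir, or raise with a descriptive message."""
    p = Path(image_dir).expanduser().resolve()
    if not p.exists():
        # Showing the working directory makes relative-path mistakes obvious.
        raise FileNotFoundError(f"{p} does not exist (cwd is {Path.cwd()})")
    files = [p] if p.is_file() else sorted(p.iterdir())
    images = [f for f in files if f.suffix.lower() in IMAGE_SUFFIXES]
    if not images:
        raise ValueError(f"no image files found in {p}")
    return images
```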
@pavitrabiradar6334 1 year ago
Hello, I am always getting an empty xlsx file as output. Could you please help me here?
@karndeepsingh 1 year ago
Maybe the OCR is not working that well. You may consider replacing the OCR.
@ShivShankarDutta1 2 years ago
Getting this error when executing:
#%cd PaddleOCR/ppstructure
!python3 /content/PaddleOCR/ppstructure/table/predict_table.py --det_model_dir=inference/en_PP-OCRv3_det_infer --rec_model_dir=inference/en_ppocr_mobile_v2.0_table_rec_infer --table_model_dir=inference/en_ppocr_mobile_v2.0_table_structure_infer --image_dir=/content/PaddleOCR/ppstructure/table_2.png --rec_char_dict_path=../ppocr/utils/dict/table_dict.txt --table_char_dict_path=../ppocr/utils/dict/table_structure_dict.txt --det_limit_side_len=736 --det_limit_type=min --output ./output/table
Traceback (most recent call last):
  File "/content/PaddleOCR/ppstructure/table/predict_table.py", line 30, in <module>
    import tools.infer.predict_det as predict_det
  File "/content/PaddleOCR/tools/infer/predict_det.py", line 31, in <module>
    from ppocr.data import create_operators, transform
  File "/content/PaddleOCR/ppocr/data/__init__.py", line 35, in <module>
    from ppocr.data.imaug import transform, create_operators
  File "/content/PaddleOCR/ppocr/data/imaug/__init__.py", line 47, in <module>
    from .ct_process import *
  File "/content/PaddleOCR/ppocr/data/imaug/ct_process.py", line 22, in <module>
    import Polygon as plg
ModuleNotFoundError: No module named 'Polygon'
@rohithuria1168 2 years ago
How to fix this error?
@goswamidivyang2010 2 years ago
@@rohithuria1168 Did you get any fix for that? I am also facing the same error.
@luisvite3766 2 years ago
Me too
@xy4611 24 days ago
maybe !pip install polygon
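A quick way to confirm which module is actually missing before reinstalling anything is sketched below. Note the assumption, labelled in the code as well: to my knowledge the `Polygon` import name is provided by the PyPI package `Polygon3`, so that is the install hint printed here. Please verify this against the requirements file of your PaddleOCR checkout. `missing_modules` and `PIP_NAMES` are illustrative names.

```python
import importlib.util


def missing_modules(names):
    """Return the import names from `names` that cannot be found in this environment."""
    return [n for n in names if importlib.util.find_spec(n) is None]


# Import name -> pip package name.
# Assumption: the `Polygon` module is shipped by the `Polygon3` package on PyPI.
PIP_NAMES = {"Polygon": "Polygon3"}

for name in missing_modules(PIP_NAMES):
    print(f"missing module {name!r}; try: pip install {PIP_NAMES[name]}")
```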
@shobhitsadwal6081 11 months ago
It is not working for me.
@rajeshroyal5922 2 years ago
I have tried with VS Code and Colab, but I am getting this error: python3: can't open file '/PaddleOCR/ppstructure/table/predict_table.py': [Errno 2] No such file or directory
@thepresistence5935 2 years ago
change the path bro
@rajeshroyal5922 2 years ago
@@thepresistence5935 I tried changing the path too and am getting the same error.
@thepresistence5935 2 years ago
@@rajeshroyal5922 It's working fine for me, put quotes.
@vogel2499 2 years ago
I suspect text OCR is independent of table detection/recognition. You could replace it with easyocr/pytesseract without ruining the structure.
@shwetabhilare9473 1 year ago
[Errno 2] No such file or directory: 'PaddleOCR/ppstructure'
/content/PaddleOCR/ppstructure/inference
Traceback (most recent call last):
  File "/content/PaddleOCR/ppstructure/table/predict_table.py", line 30, in <module>
    import tools.infer.predict_det as predict_det
  File "/content/PaddleOCR/tools/infer/predict_det.py", line 31, in <module>
    from ppocr.data import create_operators, transform
  File "/content/PaddleOCR/ppocr/data/__init__.py", line 35, in <module>
    from ppocr.data.imaug import transform, create_operators
  File "/content/PaddleOCR/ppocr/data/imaug/__init__.py", line 47, in <module>
    from .ct_process import *
  File "/content/PaddleOCR/ppocr/data/imaug/ct_process.py", line 22, in <module>
    import Polygon as plg
ModuleNotFoundError: No module named 'Polygon'
Getting this error, please help.
@madhavkumarpancholi9842 1 year ago
Get to the point, dude.
@anouaraadoud58 1 year ago
[Errno 2] No such file or directory: 'PaddleOCR/ppstructure'
/content/PaddleOCR/ppstructure/inference
Traceback (most recent call last):
  File "/content/PaddleOCR/ppstructure/table/predict_table.py", line 230, in <module>
    main(args)
  File "/content/PaddleOCR/ppstructure/table/predict_table.py", line 153, in main
    table_sys = TableSystem(args)
  File "/content/PaddleOCR/ppstructure/table/predict_table.py", line 67, in __init__
    self.text_detector = predict_det.TextDetector(copy.deepcopy(
  File "/content/PaddleOCR/tools/infer/predict_det.py", line 141, in __init__
    self.predictor, self.input_tensor, self.output_tensors, self.config = utility.create_predictor(
  File "/content/PaddleOCR/tools/infer/utility.py", line 199, in create_predictor
    raise ValueError(
ValueError: not find model.pdmodel or inference.pdmodel in inference/en_PP-OCRv3_det_infer
@SaniyaFarash 1 year ago
I am getting the same error. Please tell me how to solve this.
@rajeshroyal5922 2 years ago
I can't open the predict_table.py file; I am getting the same error: python3: can't open file '/PaddleOCR/ppstructure/table/predict_table.py': [Errno 2] No such file or directory. How can I resolve it?
@kiddicode6897 2 years ago
%cd /content/PaddleOCR: go to the path
!mkdir inference: create the folder "inference" under "/content/PaddleOCR"
%cd /content/PaddleOCR/inference: go to that path
Then download and unzip the file inside "inference".
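After following the steps above, a short check like this one can confirm that each model folder really contains a model file, which is what the "not find model.pdmodel or inference.pdmodel" error reported elsewhere in these comments complains about. It is a sketch using only the standard library; `missing_model_dirs` is an illustrative name, and the two file names are taken from that error message.

```python
from pathlib import Path

# File names taken from the ValueError text: either one is accepted.
MODEL_FILES = ("inference.pdmodel", "model.pdmodel")


def missing_model_dirs(inference_root, model_dirs):
    """Return the model folders under inference_root that contain no model file."""
    root = Path(inference_root)
    return [
        d for d in model_dirs
        if not any((root / d / f).is_file() for f in MODEL_FILES)
    ]
```

For the command quoted in this comment section, the folders to check would be en_PP-OCRv3_det_infer, en_ppocr_mobile_v2.0_table_rec_infer, and en_ppocr_mobile_v2.0_table_structure_infer under /content/PaddleOCR/ppstructure/inference, since the relative `inference/...` arguments are resolved from the current working directory.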