xavctn comments

Results: 11 comments by xavctn

pdf to excel - columns got merged

Hello, Thanks for the feedback. I am aware of this issue, which occurs whenever columns are not clearly separated. I will try to work on it in upcoming updates.

Lighter version of this package

Hello, As of now, I do not plan to create a lighter version. Considering what the algorithm uses for processing, only beautifulsoup and xlsxwriter would be...

width and height of each grid

Hello, Right now, cell heights and widths are supposed to be autofitted to their content, but it might be possible to do this. I will check if I can do it...

VisionOCR class missing in module after installing [gcp]

Hello, That should not be happening. Which version of the library do you have installed?

Cannot extract from these images

Hello, I made some modifications to the algorithm that are going to be included in the next release. I am not sure if it is going to handle those tables...

Getting FileNotFound error

Hello, Can you provide the numba version installed in your environment? I will try to replicate the issue.

Getting FileNotFound error

Hi, I tried to replicate the issue on Windows using Python 3.11 but I was not able to. To be honest, I do not know what is happening on your end. I...

Exotic sheet format impact over table/cell recognition?

Hello, This is not really supposed to happen. Can you run the extraction **without** any OCR and check the number of columns in your table (using the `extract_tables` method)?...

Missing column header content

Hello, I took a look at it and this is due to the poor quality of the table header, which disrupts the table detection. As of now, I won't...

PDF table.box is inaccurate?

Hello, As mentioned in the documentation, when processing PDFs, all pages are converted to images using a DPI of 200. The table coordinates returned by the library correspond to this...
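
For readers hitting the same mismatch, here is a minimal sketch of rescaling a box from 200 DPI image pixels to PDF points, relying only on the fact that a PDF point is 1/72 inch. The function name `pixels_to_pdf_points` and the `(x1, y1, x2, y2)` tuple layout are assumptions made for illustration; they are not part of the library's API.

```python
def pixels_to_pdf_points(box, dpi=200):
    """Rescale a box from image pixels to PDF points.

    `box` is assumed to be an (x1, y1, x2, y2) tuple in pixels of a page
    rendered at `dpi`. A PDF point is 1/72 inch, so each value is scaled
    by 72 / dpi. Only scaling is applied: the origin and axis direction
    are left unchanged.
    """
    return tuple(round(value * 72 / dpi, 2) for value in box)


# Hypothetical box, in pixels of a page rendered at 200 DPI
box_in_pixels = (200, 400, 1000, 1200)
box_in_points = pixels_to_pdf_points(box_in_pixels)
print(box_in_points)
```

If the target tool measures from the bottom-left corner of the page, the y values would also need flipping against the page height, which this sketch does not attempt.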