Problems with camelot when trying to extract a table with one row

Open Eugenia27 opened this issue 5 years ago • 8 comments

Hi, I'm working with many PDFs that all have tables with the same header but different numbers of rows. Some are very long and span many pages, while others have only the header and one row. The problem arises when there is only the header and one row. For example, if the table spans three pages with the header on each of them, and the last page has only one row, that page is not detected as a table by Camelot. I don't know how to solve it.

This one works fine: pdf1.pdf

But this one does not: pfd2.pdf

Eugenia27 avatar May 05 '20 21:05 Eugenia27

I am seeing the exact same issue

jbendes avatar May 08 '20 06:05 jbendes

Me too. Any idea how to tweak the parameters to make it recognize a single-cell table?

astariul avatar Aug 26 '20 08:08 astariul

I'm also facing the same issue. Any luck finding a way to work around it?

HayTaub avatar Mar 10 '21 08:03 HayTaub

I'm facing the exact same issue. Has anyone found a workaround?

vivianah avatar Apr 09 '21 02:04 vivianah

Same here. The table is at the end of the page. I tried using "table_areas", but it breaks everything that was already being read correctly.

SG-aortega avatar Jun 16 '21 23:06 SG-aortega

Also facing the same problem. It's not only two-row tables; I sometimes need to read a one-row table.

HayTaub avatar Jun 19 '21 08:06 HayTaub

I had the same problem and changed the line_scale parameter (increased from default 15 to 30), as it then detects smaller lines. My guess is that the vertical lines of a single-row table are too short otherwise.
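
For anyone who wants to try this, here is a minimal sketch of passing a larger line_scale to Camelot's lattice parser. The helper name read_lattice_tables and its reader argument are made up for illustration (the latter only so the call can be checked without a PDF); camelot.read_pdf, flavor="lattice", pages and line_scale are the real Camelot names. The value 30 is just what worked in this case, so treat it as a starting point.

```python
def read_lattice_tables(pdf_path, pages="all", line_scale=30, reader=None):
    """Read tables with the lattice parser and a larger line_scale.

    Camelot's default line_scale is 15; a larger value lets the parser
    pick up shorter ruling lines, such as the short vertical borders
    of a table with a single row.
    """
    if reader is None:
        # Imported lazily so the helper can be exercised without Camelot.
        import camelot
        reader = camelot.read_pdf
    # line_scale only applies to the lattice flavor.
    return reader(pdf_path, pages=pages, flavor="lattice", line_scale=line_scale)
```

A call could look like read_lattice_tables("pfd2.pdf", pages="3"), where the page number is only an example; the number of tables found is then len() of the result. If I remember the docs correctly, a very large line_scale can cause text to be detected as lines, so it is probably worth increasing it gradually and checking the pages that already worked.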

Tomp0801 avatar Oct 08 '21 13:10 Tomp0801

Hi, this seems to work for me. Thanks!

pawankurada avatar Dec 14 '21 07:12 pawankurada