Camelot switches characters around when there are double line breaks

Open gnadlr opened this issue 2 years ago • 0 comments

Describe the bug

When a cell contains an empty line separating portions of the text (a double line break), the first character after the double line break moves to the beginning of the text, and only a single line break is returned. The dropped line break would not matter if the character order could be fixed.

Original text in cell:

ABC
DEF

GHI

Actual behavior of camelot.read_pdf(..., flavor='lattice'):

GABC
DEF
HI

Sample PDF (row 6, 2nd column): 1b.pdf
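
For anyone trying to reproduce this, here is a minimal sketch of how the symptom could be checked. The helper names (`strip_ws`, `is_reordered`, `read_cell`) are made up for illustration and are not part of camelot. `read_cell` assumes the affected table is the first one camelot finds and that the caller supplies the right row and column indices for the sample PDF, neither of which is confirmed here. The comparison ignores whitespace, since the dropped line break is not the real problem.

```python
from collections import Counter


def strip_ws(text: str) -> str:
    """Drop all whitespace so only the character order is compared."""
    return "".join(text.split())


def is_reordered(expected: str, actual: str) -> bool:
    """True when both strings hold the same characters in a different order."""
    e, a = strip_ws(expected), strip_ws(actual)
    return e != a and Counter(e) == Counter(a)


def read_cell(pdf_path: str, row: int, col: int) -> str:
    """Extract one cell with the lattice flavor.

    Needs camelot installed and the sample PDF on disk. Using tables[0]
    is an assumption; the affected table may be at another index.
    """
    import camelot  # imported lazily so the helpers above work without it

    tables = camelot.read_pdf(pdf_path, flavor="lattice")
    return tables[0].df.iloc[row, col]


# Strings taken from the report above.
expected = "ABC\nDEF\n\nGHI"  # original text in the cell
actual = "GABC\nDEF\nHI"      # what flavor='lattice' returned

print(is_reordered(expected, actual))  # True
```

With the sample PDF available, `actual` could be replaced by the result of `read_cell("1b.pdf", row, col)` to check whether a given camelot version still shows the problem.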

This seems to be an old bug. This Stack Overflow question describes the exact issue, and it was also raised in the old camelot repo without a fix.

Hope you guys can look into it.

gnadlr, Jun 29 '23 11:06