Extraction not recognising table column

Open martinswanson opened this issue 3 years ago • 1 comment

The attached PDF file is not processed correctly:

In Stream mode, Tabula does not recognize the last column (col 4).
Cell data from col 4 is extracted, but it is merged into the text of col 3.
The error occurs with both auto-detection and manual selection.
The page is an extract from a large document; its many other pages use the same table format and are processed correctly.

Please advise whether it is possible to force Tabula to detect 4 columns.

Feb 28 '22 13:02 martinswanson

Try this:

SpreadsheetExtractionAlgorithm extractor = new SpreadsheetExtractionAlgorithm();
List<Table> table = extractor.extract(page);

Sep 01 '22 06:09 oswardlx
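
For context on what "forcing" four columns would mean in Stream mode, here is a minimal, self-contained sketch of splitting a row of text at explicit x-coordinate boundaries instead of relying on detected whitespace gaps. It does not call Tabula: `ColumnSplitter`, `Chunk`, `splitRow`, and all coordinates are hypothetical names and values chosen for illustration. As far as I know, the tabula-java command line exposes a similar idea through its `-c`/`--columns` option, which takes the x coordinates of column boundaries, but check the version you are using.

```java
import java.util.ArrayList;
import java.util.List;

// Illustration only: none of these types come from Tabula.
class ColumnSplitter {

    /** A piece of text together with the x coordinate of its left edge. */
    static final class Chunk {
        final double left;
        final String text;

        Chunk(double left, String text) {
            this.left = left;
            this.text = text;
        }
    }

    /**
     * Assigns each chunk to a column using explicit boundaries.
     * Boundaries must be sorted ascending; n boundaries yield n + 1 columns.
     */
    static List<String> splitRow(List<Chunk> chunks, List<Double> boundaries) {
        List<StringBuilder> cells = new ArrayList<>();
        for (int i = 0; i <= boundaries.size(); i++) {
            cells.add(new StringBuilder());
        }
        for (Chunk chunk : chunks) {
            // The column index is the number of boundaries at or left of the chunk.
            int col = 0;
            while (col < boundaries.size() && chunk.left >= boundaries.get(col)) {
                col++;
            }
            StringBuilder cell = cells.get(col);
            if (cell.length() > 0) {
                cell.append(' ');
            }
            cell.append(chunk.text);
        }
        List<String> row = new ArrayList<>();
        for (StringBuilder cell : cells) {
            row.add(cell.toString());
        }
        return row;
    }

    public static void main(String[] args) {
        // Made-up row: the last two chunks sit close together, as in cols 3 and 4.
        List<Chunk> chunks = List.of(
                new Chunk(40, "Item"),
                new Chunk(150, "12"),
                new Chunk(260, "Widget"),
                new Chunk(300, "9.50"));

        // Two boundaries: the last two chunks share one cell.
        System.out.println(splitRow(chunks, List.of(100.0, 200.0)));
        // prints [Item, 12, Widget 9.50]

        // A third boundary at x = 290 separates them into four columns.
        System.out.println(splitRow(chunks, List.of(100.0, 200.0, 290.0)));
        // prints [Item, 12, Widget, 9.50]
    }
}
```

The first call reproduces the symptom described in the issue (col 4 text merged into col 3), and the second shows how one extra boundary between the two columns separates them. With the real tool, the boundary positions would have to be measured on the affected page.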