
Extraction not recognising table column

Open • martinswanson opened this issue 3 years ago • 1 comment

The attached PDF file is not processed correctly:

  • in Stream mode, Tabula does not recognize the last column (col 4)
  • cell data from col 4 is extracted, but it is merged into the text of col 3
  • the error occurs with both auto-detection and manual selection
  • the page is an extract from a large document; its many other pages use the same table format and are processed correctly

Is it possible to force Tabula to detect 4 columns?

comments.pdf

martinswanson • Feb 28 '22 13:02

Try this:

  SpreadsheetExtractionAlgorithm extractor = new SpreadsheetExtractionAlgorithm();
  List<Table> tables = extractor.extract(page);

oswardlx • Sep 01 '22 06:09
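
On the original question of forcing four columns: as far as I know, Stream mode accepts explicit column boundaries, either through the -c/--columns command-line option (a comma-separated list of x coordinates) or through the BasicExtractionAlgorithm.extract(page, verticalRulingPositions) overload in Java. Please check both against your tabula-java version. The sketch below is not tabula's code. It is a simplified, self-contained model of the idea, using a hypothetical Chunk type and made-up coordinates, to show why a boundary placed between col 3 and col 4 keeps the two cells from merging.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class ColumnSplitSketch {

    /** Hypothetical stand-in for a text chunk: its left x coordinate and its text. */
    static final class Chunk {
        final double left;
        final String text;

        Chunk(double left, String text) {
            this.left = left;
            this.text = text;
        }
    }

    /** Column index of x: the number of boundaries at or to the left of x. */
    static int columnIndex(double x, double[] boundaries) {
        int col = 0;
        for (double b : boundaries) {
            if (x >= b) {
                col++;
            }
        }
        return col;
    }

    /** Distributes the chunks of one text row over boundaries.length + 1 cells. */
    static List<String> splitRow(List<Chunk> row, double[] boundaries) {
        List<StringBuilder> cells = new ArrayList<>();
        for (int i = 0; i <= boundaries.length; i++) {
            cells.add(new StringBuilder());
        }
        for (Chunk c : row) {
            StringBuilder cell = cells.get(columnIndex(c.left, boundaries));
            if (cell.length() > 0) {
                cell.append(' '); // chunks landing in the same cell are joined
            }
            cell.append(c.text);
        }
        List<String> out = new ArrayList<>();
        for (StringBuilder sb : cells) {
            out.add(sb.toString());
        }
        return out;
    }

    public static void main(String[] args) {
        // Made-up boundaries: three x positions give four columns.
        double[] boundaries = {100.0, 250.0, 400.0};
        List<Chunk> row = Arrays.asList(
                new Chunk(20, "1"),
                new Chunk(110, "Section 2"),
                new Chunk(260, "Comment text"),
                new Chunk(410, "Accepted"));
        System.out.println(splitRow(row, boundaries));
    }
}
```

If the last boundary (400.0 here) is dropped, the fourth chunk falls into the third cell, which matches the symptom described in the issue. In practice the boundary coordinates would have to be measured from the PDF page itself.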