tabula-java
Extraction not recognising table column
The attached PDF file is not processed correctly:
- in Stream mode, Tabula does not recognize the last column (col 4)
- cell data from col 4 is extracted, but it is merged into the text of col 3
- the error occurs with both auto-detection and manual selection
- the page is an extract from a large document; many other pages with the same table format are processed correctly
Is it possible to force Tabula to detect 4 columns?
try this:
SpreadsheetExtractionAlgorithm extractor = new SpreadsheetExtractionAlgorithm();
List<Table> table = extractor.extract(page);
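Note that SpreadsheetExtractionAlgorithm is the lattice-mode extractor, so it only helps if the table has ruling lines. If the page has to stay in Stream mode, one option that may work is to pass explicit column boundaries to BasicExtractionAlgorithm, which has an extract(Page, List<Float>) overload. Below is a minimal sketch, assuming tabula-java with PDFBox 2.x on the classpath (PDFBox 3.x loads documents differently). The class name ForceColumns, the helper parseBoundaries, and the command-line arguments are hypothetical; the x-coordinates have to be measured from the actual PDF. Three boundaries separate four columns.

```java
import java.io.File;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

import org.apache.pdfbox.pdmodel.PDDocument;

import technology.tabula.ObjectExtractor;
import technology.tabula.Page;
import technology.tabula.RectangularTextContainer;
import technology.tabula.Table;
import technology.tabula.extractors.BasicExtractionAlgorithm;

public class ForceColumns {

    // Hypothetical helper: parses "x1,x2,x3" (PDF points) into a list of floats.
    static List<Float> parseBoundaries(String csv) {
        List<Float> xs = new ArrayList<>();
        for (String part : csv.split(",")) {
            xs.add(Float.parseFloat(part.trim()));
        }
        return xs;
    }

    public static void main(String[] args) throws IOException {
        // Assumed usage: ForceColumns <file.pdf> <x1,x2,x3>
        String pdfPath = args[0];
        List<Float> boundaries = parseBoundaries(args[1]);

        // PDFBox 2.x API; the document is closed when the try block exits.
        try (PDDocument document = PDDocument.load(new File(pdfPath))) {
            ObjectExtractor objectExtractor = new ObjectExtractor(document);
            Page page = objectExtractor.extract(1); // page numbers are 1-based

            // Stream mode with explicit vertical boundaries instead of guessed ones.
            BasicExtractionAlgorithm stream = new BasicExtractionAlgorithm();
            List<Table> tables = stream.extract(page, boundaries);

            // Print each row with cells separated by " | " to check the split.
            for (Table table : tables) {
                for (List<RectangularTextContainer> row : table.getRows()) {
                    StringBuilder line = new StringBuilder();
                    for (RectangularTextContainer cell : row) {
                        if (line.length() > 0) {
                            line.append(" | ");
                        }
                        line.append(cell.getText());
                    }
                    System.out.println(line);
                }
            }
        }
    }
}
```

The command-line tool exposes the same idea through its --columns option, which takes a comma-separated list of x-coordinates. Since the other pages of the document share the same table format, the same boundaries should be reusable across pages, provided the layout does not shift.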