camelot
Same table extracted twice from PDF in stream mode
Describe the bug
Under some circumstances, camelot extracts the same table twice. This happened in stream mode: the first extracted table covers the table only partially.
Steps to reproduce the bug
- Install camelot-py[base] with pip
- Download the PDF file below
- Run the script below
Expected behavior
I expected camelot to extract either exactly one table or multiple tables that do not overlap.
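The "do not overlap" expectation can be made concrete with a small check on table bounding boxes. This is only a sketch: it assumes each extracted table's bounding box is available as an (x1, y1, x2, y2) tuple with x1 < x2 and y1 < y2. The helper names `boxes_overlap` and `overlapping_pairs` and the coordinates are hypothetical and not part of camelot.

```python
from itertools import combinations


def boxes_overlap(a, b):
    """Return True if two (x1, y1, x2, y2) boxes share a region of positive area."""
    ax1, ay1, ax2, ay2 = a
    bx1, by1, bx2, by2 = b
    # Boxes overlap only if their extents intersect on both axes.
    return ax1 < bx2 and bx1 < ax2 and ay1 < by2 and by1 < ay2


def overlapping_pairs(boxes):
    """Return index pairs (i, j), i < j, of boxes that overlap."""
    return [
        (i, j)
        for (i, a), (j, b) in combinations(enumerate(boxes), 2)
        if boxes_overlap(a, b)
    ]


# Made-up coordinates for illustration only. With camelot, the boxes might
# be read from each table (for example a `_bbox` attribute; that is an
# assumption about camelot internals and may differ between versions).
boxes = [
    (50, 400, 300, 700),   # partial table
    (50, 100, 550, 700),   # full table containing the partial one
    (400, 20, 550, 90),    # unrelated table elsewhere on the page
]
print(overlapping_pairs(boxes))
```

An empty result would match the expected behavior. In the made-up example the first two boxes are reported as overlapping, which mirrors the partial-plus-full extraction described in this report.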
Code
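#!/usr/bin/env python3
import camelot

# Extract tables from page 8 using the stream flavor
tables = camelot.read_pdf("./Lijnfolder-dr-2024-regio-Arnhem.pdf", "8", flavor="stream")

# Plot the detected contour of each extracted table
for table in tables:
    camelot.plot.contour(table)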
https://www.connexxion.nl/getmedia/c2bce2c6-ebfe-43a9-8154-0b6bec9244fd/Lijnfolder-dr-2024-regio-Arnhem.pdf
Screenshots
Environment
- OS: Windows 11
- Python version: 3.12.8
- Numpy version: 2.0.2
- OpenCV version: 4.11.0
- Ghostscript version:
- camelot version: 1.0.0
Additional context