hbscorez
Evaluate PDF parsing libraries
Currently using https://github.com/chezou/tabula-py/tree/master. Problems:
- needs Java
- doesn't handle some corner cases well
Alternatives:
- https://github.com/camelot-dev/camelot
- https://github.com/pymupdf/PyMuPDF-Utilities
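One way to run the evaluation is a small harness that sends the same PDF through each candidate and records what comes back. The sketch below is only a starting point: the names `extract_with_tabula`, `extract_with_camelot`, and `compare` are hypothetical, and the adapters assume that `tabula.read_pdf` returns a list of DataFrames and that `camelot.read_pdf` returns tables exposing a `.df` attribute, which may vary by version. Imports are done lazily so a missing library (or a missing Java runtime) shows up as a failed entry in the report instead of stopping the run.

```python
from typing import Callable, Dict, List

# A table is a list of rows; a row is a list of cell values.
Table = List[List[object]]
Extractor = Callable[[str], List[Table]]


def extract_with_tabula(path: str) -> List[Table]:
    """Adapter for tabula-py (assumed API; requires a Java runtime)."""
    import tabula  # imported lazily so the harness loads without it

    frames = tabula.read_pdf(path, pages="all")
    return [df.values.tolist() for df in frames]


def extract_with_camelot(path: str) -> List[Table]:
    """Adapter for camelot (assumed API)."""
    import camelot  # imported lazily so the harness loads without it

    tables = camelot.read_pdf(path, pages="all")
    return [t.df.values.tolist() for t in tables]


def compare(extractors: Dict[str, Extractor], path: str) -> Dict[str, dict]:
    """Run every extractor on one PDF and summarise the outcome."""
    report: Dict[str, dict] = {}
    for name, extract in extractors.items():
        try:
            tables = extract(path)
            report[name] = {
                "ok": True,
                "tables": len(tables),
                "rows": sum(len(t) for t in tables),
            }
        except Exception as exc:  # record the failure, keep evaluating
            report[name] = {"ok": False, "error": f"{type(exc).__name__}: {exc}"}
    return report
```

Calling `compare({"tabula": extract_with_tabula, "camelot": extract_with_camelot}, "sample.pdf")` (with `sample.pdf` standing in for one of the problematic files) gives a per-library summary. Table and row counts are a coarse signal, so the corner cases still need a manual look at the extracted cells.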