Alternatives to GROBID (PDF parsing)

Open jankrepl opened this issue 2 years ago • 4 comments

Are there any alternatives to GROBID and would there be any major advantages in using them?

Alternatives (feel free to add new entries)

  • https://github.com/pdfminer/pdfminer.six
  • https://github.com/mstamy2/PyPDF2
  • https://github.com/pymupdf/PyMuPDF

Other links

Comments

If we go for a pure Python solution, there might be no need for an intermediary format (i.e. TEI XML for GROBID).
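To make the point above concrete, here is a minimal sketch of going straight from a PDF to plain text in Python, with no TEI XML step in between. The names `pdf_to_text` and `BACKENDS` are made up for illustration and are not part of any of the listed libraries. It assumes pdfminer.six and/or PyMuPDF are installed, and each library is imported only when its backend is actually used. Note that this only yields raw text, not the structured metadata GROBID produces.

```python
from typing import Callable, Dict


def _with_pdfminer(path: str) -> str:
    # pdfminer.six: high-level helper returning the whole document as one string.
    from pdfminer.high_level import extract_text

    return extract_text(path)


def _with_pymupdf(path: str) -> str:
    # PyMuPDF is imported under the name "fitz".
    import fitz

    with fitz.open(path) as doc:
        # Concatenate the plain text of every page.
        return "\n".join(page.get_text() for page in doc)


# Hypothetical registry mapping a backend name to its extraction function.
BACKENDS: Dict[str, Callable[[str], str]] = {
    "pdfminer": _with_pdfminer,
    "pymupdf": _with_pymupdf,
}


def pdf_to_text(path: str, backend: str = "pdfminer") -> str:
    """Return the plain text of a PDF using the chosen pure Python backend."""
    try:
        extractor = BACKENDS[backend]
    except KeyError:
        raise ValueError(
            f"unknown backend {backend!r}; choose from {sorted(BACKENDS)}"
        ) from None
    return extractor(path)
```

Usage would look like `text = pdf_to_text("paper.pdf", backend="pymupdf")`. Keeping the backends behind one function should make it easy to compare the alternatives on the same files before committing to one.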

jankrepl commented on Oct 20 '21 12:10