
Minor reader bugfixes and fortified regression testing

Open jamesbraza opened this issue 3 weeks ago • 1 comment

This PR improves consistency across readers:

  • Tests equation parsing of Docling and PyMuPDF
  • Renames PyMuPDF's image_dpi arg to dpi, since (1) this parameter impacts tables too and (2) it will now match other readers' DPI parameter name (e.g. Docling)
  • Updates PyMuPDF reader's default DPI to unspecified, to align with PyMuPDF's default behavior
  • Includes full_page in the autogenerated index name hash, so an index built with full-page parsing cannot collide with one built without it
  • Adds many smaller assertions: image coordinates and shapes, FileNotFoundError, expected screenshot type
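
To illustrate why full_page belongs in the index name hash, here is a minimal sketch, not paper-qa's actual implementation. The helper `index_name`, the `pqa_index_` prefix, and the settings keys other than `full_page` are hypothetical and chosen only for illustration. It shows that once the flag is part of the hashed payload, two otherwise identical configurations produce different index names.

```python
import hashlib
import json


def index_name(settings: dict) -> str:
    """Hypothetical helper: derive a stable index name from parsing settings."""
    # Sort keys so that dict insertion order does not change the hash.
    payload = json.dumps(settings, sort_keys=True)
    digest = hashlib.sha256(payload.encode("utf-8")).hexdigest()
    return f"pqa_index_{digest}"


# Assumed example settings; only the full_page flag differs between the two.
base = {"reader": "pymupdf", "chunk_size": 3000}
with_full_page = index_name({**base, "full_page": True})
without_full_page = index_name({**base, "full_page": False})

# The two names differ, so the indexes cannot overwrite each other.
print(with_full_page != without_full_page)
```

If the flag were left out of the hashed payload, both configurations would map to the same name and the second build could silently reuse or overwrite the first index.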

jamesbraza, Nov 14 '25 23:11

Related Documentation

Checked 1 published document(s) in 1 knowledge base(s). No updates required.


dosubot[bot], Nov 14 '25 23:11