paper-qa
Minor reader bugfixes and fortified regression testing
This PR improves consistency across readers:
- Tests equation parsing of Docling and PyMuPDF
- Renames PyMuPDF's `image_dpi` arg to `dpi`, since (1) this parameter impacts tables too and (2) it will now match other readers' DPI parameter name (e.g. Docling)
- Updates PyMuPDF reader's default DPI to unspecified, to align with PyMuPDF's default behavior
- Includes `full_page` in autogenerated index name hash (as full page parsing should not have a collision)
- Adds many smaller assertions: image coordinates and shapes, `FileNotFoundError`, expected screenshot type
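
To illustrate why `full_page` belongs in the index name hash, here is a minimal sketch, not the project's actual implementation. The helper `make_index_name`, the `pqa_index_` prefix, and the settings dictionary are hypothetical names chosen for this example. The idea is only that any setting that changes parse output should feed the hash, so indexes built with different settings do not share a name.

```python
import hashlib
import json


def make_index_name(parse_settings: dict) -> str:
    """Hypothetical helper: derive an index name from parse-affecting settings."""
    # Serialize with sorted keys so the same settings always hash identically.
    payload = json.dumps(parse_settings, sort_keys=True)
    digest = hashlib.sha256(payload.encode("utf-8")).hexdigest()
    return f"pqa_index_{digest[:16]}"


# Two configurations that differ only in full-page parsing.
name_full_page = make_index_name({"dpi": None, "full_page": True})
name_default = make_index_name({"dpi": None, "full_page": False})

# Different names mean the two indexes cannot collide on disk.
print(name_full_page != name_default)
```

If `full_page` were left out of the hashed settings, both configurations would map to the same name and a full-page index could silently reuse one built without it.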