paper-qa
Supporting per-page `ImpossibleParsingError` as opposed to per-Document
As of `paper-qa==5.2.0`, `parse_pdf_to_pages` discards an entire document if any one of its pages raises an `ImpossibleParsingError`.
Most of the time, when we hit an `ImpossibleParsingError`, it's because just one small part of the document failed to parse.
Ideally we could keep the rest of the document, and just discard the specific failed-to-parse page(s).
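
A minimal sketch of what per-page handling could look like. This is not paper-qa's actual implementation: `ImpossibleParsingError` is redefined locally as a stand-in, and `parse_pages_tolerantly` and its `(page_number, extract_text)` input shape are hypothetical names chosen for illustration. The idea is to catch the error per page, record which pages failed, and only fail the whole document if nothing could be parsed.

```python
from collections.abc import Callable, Iterable


class ImpossibleParsingError(Exception):
    """Local stand-in for paper-qa's exception of the same name (assumption)."""


def parse_pages_tolerantly(
    pages: Iterable[tuple[int, Callable[[], str]]],
) -> tuple[dict[str, str], dict[int, str]]:
    """Parse pages one at a time, skipping pages that cannot be parsed.

    `pages` yields (page_number, extract_text) pairs, where extract_text is a
    hypothetical zero-argument callable returning that page's text.
    Returns (parsed pages keyed by page number as str, failures keyed by page number).
    """
    parsed: dict[str, str] = {}
    failed: dict[int, str] = {}
    for page_num, extract_text in pages:
        try:
            parsed[str(page_num)] = extract_text()
        except ImpossibleParsingError as exc:
            # Discard only this page and remember why it failed.
            failed[page_num] = str(exc)
    if failed and not parsed:
        # Every page failed, so the document as a whole is unusable.
        raise ImpossibleParsingError(f"All {len(failed)} page(s) failed to parse.")
    return parsed, failed
```

Returning the failed page numbers alongside the parsed text would let callers log or surface the partial parse rather than silently dropping pages.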