
Supporting per-page `ImpossibleParsingError` as opposed to per-Document

Open jamesbraza opened this issue 1 year ago • 0 comments

As of `paper-qa==5.2.0`, within `parse_pdf_to_pages` we discard an entire document if any one of its pages raises an `ImpossibleParsingError`.

Most of the time we hit an `ImpossibleParsingError`, it's due to a failure on just one small part of the document.

Ideally we could keep the rest of the document, and just discard the specific failed-to-parse page(s).
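
A minimal sketch of what per-page handling could look like, not the actual `parse_pdf_to_pages` implementation. Everything here is an assumption for illustration: `parse_pages_tolerantly`, the `parse_page` callable, and `fake_parse_page` are hypothetical names, and the `ImpossibleParsingError` class below is a local stand-in so the snippet runs without paper-qa installed.

```python
from collections.abc import Callable, Iterable


class ImpossibleParsingError(Exception):
    """Local stand-in for paper-qa's exception of the same name (assumption)."""


def parse_pages_tolerantly(
    page_numbers: Iterable[int],
    parse_page: Callable[[int], str],
) -> tuple[dict[str, str], list[int]]:
    """Parse each page on its own, skipping pages that cannot be parsed.

    `parse_page` is a hypothetical per-page parser that returns the page text
    or raises ImpossibleParsingError. Returns the parsed pages keyed by page
    number (as a string) plus the list of page numbers that failed.
    """
    pages: dict[str, str] = {}
    failed: list[int] = []
    for number in page_numbers:
        try:
            pages[str(number)] = parse_page(number)
        except ImpossibleParsingError:
            # Drop only this page instead of the whole document.
            failed.append(number)
    if failed and not pages:
        # Every page failed, so there is nothing left worth keeping.
        raise ImpossibleParsingError(f"All {len(failed)} pages failed to parse.")
    return pages, failed


def fake_parse_page(number: int) -> str:
    """Toy parser for the demo: page 2 is unparseable."""
    if number == 2:
        raise ImpossibleParsingError(f"Page {number} is unparseable.")
    return f"text of page {number}"


pages, failed = parse_pages_tolerantly(range(1, 4), fake_parse_page)
```

Returning the failed page numbers alongside the parsed pages would let the caller log or surface what was dropped. The sketch still raises when every page fails, so a wholly unparseable document is rejected rather than silently indexed as empty.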

jamesbraza, Oct 15 '24 18:10