WIP feat: accept binary PDF instead of just path to PDF

Open aguadoenzo opened this issue 1 year ago • 3 comments

While the library expects a filepath to be given to the PdfConverter, the underlying pdfium.PdfDocument constructor supports bytes and other types. Adding 'bytes' as an accepted type for the Document Pydantic model allows binary PDFs to be parsed by the library.
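
As a rough illustration of the idea (not the actual patch), the sketch below accepts either a path or raw bytes before anything is handed to a PDF backend. The names `PdfInput` and `validate_pdf_input` are hypothetical and do not come from the library; the header check is an extra assumption added only for the example.

```python
from pathlib import Path
from typing import Union

# Hypothetical alias for illustration; not a name used by the library.
PdfInput = Union[str, Path, bytes]


def validate_pdf_input(source: PdfInput) -> Union[str, bytes]:
    """Return a path string or raw PDF bytes; reject anything else."""
    if isinstance(source, (bytes, bytearray)):
        data = bytes(source)
        # Assumption: a PDF header should appear near the start of the data.
        if b"%PDF-" not in data[:1024]:
            raise ValueError("binary input does not look like a PDF")
        return data
    if isinstance(source, (str, Path)):
        # Paths are passed through unchanged (as a string).
        return str(source)
    raise TypeError(f"unsupported PDF input type: {type(source).__name__}")
```

The returned value is what would then be handed to the backend document constructor, which according to the description above already accepts both forms.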

I've tested this patch manually and it seems to work so far, but I'd prefer to add tests for it. However, the test suite seems to be built largely around creating a temporary file, and I'm not sure where to start testing this.
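
One possible starting point, sketched under assumptions: a test could build a tiny PDF entirely in memory and pass the resulting bytes where a temporary file path is used today. The helper name `build_minimal_pdf` is hypothetical and not part of the existing test suite; it only assembles a one-page, empty PDF with a correct cross-reference table.

```python
def build_minimal_pdf() -> bytes:
    """Assemble a one-page, empty PDF in memory (no temporary file)."""
    objects = [
        b"<< /Type /Catalog /Pages 2 0 R >>",
        b"<< /Type /Pages /Kids [3 0 R] /Count 1 >>",
        b"<< /Type /Page /Parent 2 0 R /MediaBox [0 0 612 792] >>",
    ]
    out = bytearray(b"%PDF-1.4\n")
    offsets = []
    for number, body in enumerate(objects, start=1):
        # Record the byte offset of each object for the xref table.
        offsets.append(len(out))
        out += b"%d 0 obj\n" % number + body + b"\nendobj\n"
    xref_start = len(out)
    out += b"xref\n0 %d\n" % (len(objects) + 1)
    # Each xref entry must be exactly 20 bytes long.
    out += b"0000000000 65535 f \n"
    for offset in offsets:
        out += b"%010d 00000 n \n" % offset
    out += b"trailer\n<< /Size %d /Root 1 0 R >>\n" % (len(objects) + 1)
    out += b"startxref\n%d\n%%%%EOF\n" % xref_start
    return bytes(out)
```

With the patch applied, a test could presumably pass `build_minimal_pdf()` to the converter in place of a file path; that call is left out here because its exact form depends on how the change is merged.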

aguadoenzo avatar Dec 05 '24 13:12 aguadoenzo

CLA Assistant Lite bot All contributors have signed the CLA ✍️ ✅

github-actions[bot] avatar Dec 05 '24 13:12 github-actions[bot]

I have read the CLA Document and I hereby sign the CLA

aguadoenzo avatar Dec 05 '24 13:12 aguadoenzo

Shouldn't this be done by creating a new provider, e.g. a binary_data provider or a cv2_image provider?

AGenchev avatar Feb 27 '25 15:02 AGenchev

Up, this feature could be very useful!

mauryaland avatar May 22 '25 13:05 mauryaland