open-parse Does the original image information in the PDF need to be parsed?

Open • ic-xu opened this issue 1 year ago • 1 comment

Description

PDFs mix graphics and text. In RAG, the images in a PDF often carry important information, so we generally need to return the extracted images to the user unchanged. I currently have a private version that implements image extraction, and I'm not sure whether the main branch needs this feature.
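To make "returning the images unchanged" concrete, here is a minimal sketch of how a retrieved chunk might carry its images alongside the text. The names `ImageElement`, `Chunk`, and `render_for_user` are hypothetical and not part of open-parse, and the extraction step itself (pulling the bytes out of the PDF) is assumed to have already happened.

```python
import base64
from dataclasses import dataclass, field
from typing import List


@dataclass
class ImageElement:
    """Hypothetical container for one image extracted from a PDF page."""
    page: int                      # zero-based page index the image came from
    data: bytes                    # raw image bytes, kept exactly as extracted
    mime_type: str = "image/png"   # assumed default; set per image in practice

    def to_data_uri(self) -> str:
        # Base64-encode the untouched bytes so they can travel in JSON or HTML.
        encoded = base64.b64encode(self.data).decode("ascii")
        return f"data:{self.mime_type};base64,{encoded}"


@dataclass
class Chunk:
    """Hypothetical RAG chunk: text plus the images found alongside it."""
    text: str
    images: List[ImageElement] = field(default_factory=list)


def render_for_user(chunk: Chunk) -> dict:
    # Return the text together with the original images, not a description of them.
    return {
        "text": chunk.text,
        "images": [img.to_data_uri() for img in chunk.images],
    }
```

With a layout like this, the retriever can index only `text` while the response still hands back the original image bytes for display.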

Apr 28 '24 01:04 ic-xu

Open a PR and we can take a look!

Apr 28 '24 15:04 Filimoa
