[Feature Request]: Integration of Marker with RAG for Complex PDF parsing

Open vishaldwdi opened this issue 1 year ago • 1 comments

Is there an existing issue for the same feature request?

  • [X] I have checked the existing issues.

Is your feature request related to a problem?

No response

Describe the feature you'd like

I'm proposing integrating the Marker repo into the RAG pipeline for parsing and vectorizing even the most complex PDFs, including those that contain images, tables, or other non-text elements.

https://github.com/VikParuchuri/marker

Describe implementation you've considered

No response

Documentation, adoption, use case

No response

Additional information

No response

vishaldwdi avatar Aug 13 '24 19:08 vishaldwdi

I support this. Marker is the best open-source OCR library.

npnpatidar avatar Sep 06 '24 18:09 npnpatidar

support this!

huangcaiyun avatar Sep 18 '24 12:09 huangcaiyun

@vishaldwdi Thanks so much for the suggestion — and apologies for the delayed response! ⏳

Currently, our system doesn’t support Marker-based OCR. We'd love to better understand your use case: could you share more details on how the recognition results differ between DeepDoc and Marker? 🧐 This would really help us evaluate the value and feasibility of adding support.

We truly appreciate your feedback and interest! Feel free to keep the great ideas coming 💡🙌

which-W avatar May 16 '25 09:05 which-W