
It doesn't work very well with this PDF.

Open datermine opened this issue 2 years ago • 4 comments

Here's an example of a PDF it doesn't work well with at all: https://nysirestakes.com/backend/News/news_upload/2023_Breeders_Award_12123_1706.pdf

Sample prompt: What are the headers of the table?

datermine avatar Jan 23 '24 22:01 datermine

I appreciate you reporting this! Yeah, I don't think it does too well with tables, to be honest, since I pass everything in as plain text. A possible feature to implement would be detecting tables and embedding them in a specific format.

Nutlope avatar Jan 24 '24 00:01 Nutlope

Yeah, it's a feature to implement. Detecting tables would be nice to have.

ajaxbo360 avatar Jan 24 '24 14:01 ajaxbo360

I also tried this with a research paper and it didn't work well. The PDF had tables, charts, and text. The model seemed to be hallucinating.

rudro12356 avatar Feb 10 '24 22:02 rudro12356

How about using a custom document loader like Unstructured? Unstructured is also available in LangChain.

hynra avatar Feb 28 '24 03:02 hynra
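
For anyone picking this up: the idea raised earlier in the thread, detecting tables and embedding them in a certain format, could look roughly like the sketch below. It only covers the formatting half and assumes some extractor (Unstructured or anything else) has already produced the table as a list of rows with the header first. The function name `table_to_markdown` and the sample column names are hypothetical, not taken from the project or from the PDF above.

```python
def table_to_markdown(rows):
    """Serialize a table as Markdown text so the header/cell layout survives chunking.

    `rows` is a list of rows (each a list of cell values); the first row is
    treated as the header. This input shape is an assumption for illustration.
    """
    if not rows:
        return ""
    width = max(len(row) for row in rows)

    def clean(cell):
        # Collapse newlines and repeated spaces, and escape pipes,
        # so every cell stays on a single Markdown table line.
        return " ".join(str(cell).split()).replace("|", "\\|")

    def fmt(row):
        # Pad short rows with empty cells so every line has `width` columns.
        cells = [clean(c) for c in row] + [""] * (width - len(row))
        return "| " + " | ".join(cells) + " |"

    header, *body = rows
    lines = [fmt(header), "| " + " | ".join(["---"] * width) + " |"]
    lines.extend(fmt(r) for r in body)
    return "\n".join(lines)


if __name__ == "__main__":
    # Hypothetical sample data, not the contents of the PDF in this issue.
    sample = [["Name", "Category", "Amount"], ["Example A", "X", "100"]]
    print(table_to_markdown(sample))
```

Keeping the header row in the same chunk as its cells is the point here: a question like "What are the headers of the table?" then has an explicit header line to retrieve, rather than a run of unlabelled text.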