[Bug]: pdfminer crash at start (Adaptive RAG)
Steps to reproduce
Hello,
After one month of inactivity, my RAG setup no longer works, although I have not changed anything. The container starts, but questions sent to the cloud LLM through the Pathway container no longer take my local data into account.
The error I notice in the adaptiverag container is: pathway_engine.engine.dataflow ERROR ImportError: cannot import name 'PSSyntaxError' from 'pdfminer.pdfparser' (/usr/local/lib/python3.11/site-packages/pdfminer/pdfparser.py) in operator 12.
This may be the cause.
I have already tried updating the sources from your site, but to no avail: the same error still appears in the log.
Can you please help me? It is urgent, thanks in advance.
Relevant log output
pathway_engine.engine.dataflow ERROR ImportError: cannot import name 'PSSyntaxError' from 'pdfminer.pdfparser' (/usr/local/lib/python3.11/site-packages/pdfminer/pdfparser.py) in operator 12.
What did you expect to happen?
The questions sent to the adaptive RAG should consider the local data.
Version
latest
Docker Versions (if used)
27.4
OS
Linux
On which CPU architecture did you run Pathway?
None
Hi, thank you for reporting the issue. We will look into this and get back to you.
I encountered the same issue. It seems to be a problem with the pdfminer dependency in the latest Pathway Docker image. I was able to get it to work by reverting my image to the 0.20.1 tag.
@bockisn thank you very much for sharing, it works with the mentioned tag. I changed the beginning of the Dockerfile to:
FROM pathwaycom/pathway:0.20.1
Hi @rjakomin @bockisn, this issue will be resolved in the next deployment. It was caused by the bump of the pdfminer.six dependency in the latest pdfplumber release.
For now, your suggestion should work.
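For anyone who wants to confirm whether their image is affected before pinning a tag, here is a minimal diagnostic sketch that can be run inside the container. It only uses the Python standard library; the helper names `can_import` and `installed_version` are hypothetical and not part of Pathway, and the distribution names `pathway`, `pdfplumber` and `pdfminer.six` are assumed to be the names the packages are installed under.

```python
import importlib
from importlib import metadata
from typing import Optional


def can_import(module_name: str, attr: str) -> bool:
    """Return True if `module_name` imports and exposes `attr`.

    This approximates `from module_name import attr` with an
    attribute lookup on the imported module.
    """
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        return False
    return hasattr(module, attr)


def installed_version(dist_name: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    # Distribution names are assumptions; adjust them to your environment.
    for dist in ("pathway", "pdfplumber", "pdfminer.six"):
        print(f"{dist}: {installed_version(dist) or 'not installed'}")

    # The import that fails in the reported log.
    ok = can_import("pdfminer.pdfparser", "PSSyntaxError")
    print("PSSyntaxError importable from pdfminer.pdfparser:", ok)
```

If the last line prints `False`, the installed pdfminer.six is probably one that no longer exposes that name from `pdfminer.pdfparser`, which would match the error in the log above.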
OK, great, thanks for letting me know.
Hey, @rjakomin, the new Pathway version has been released. The problem should be fixed in the pathwaycom/pathway:0.21.3 docker image.
Hey @rjakomin,
I am closing this issue, since the problem should have been resolved with release 0.21.3. Please feel free to reopen this issue or create a new one if you have any problems with Pathway.