Q: how to add dependencies for Python Function Call?

Open felixgao opened this issue 10 months ago • 1 comment

I have some PDF documents that I want to parse locally before feeding them into the Single LLM calls. For example, I want to use Docling to extract the raw text, then use LLMs to do more structured parsing and classification.

How do I do that?

felixgao, Feb 22 '25 19:02

Hi Felix,

Thanks for your interest in using Docling with our tool. We did experiment with integrating Docling into PySpur for a time, but we ultimately had to roll it back. The main issue was that including Docling increased the size of the pyspur-backend Docker image significantly (over 10GB) because of the OCR model weights, and that wasn’t ideal for many users who don’t need that functionality.

That said, you can still use Docling in your workflow. I recommend installing Docling in the same environment where you have PySpur installed (note that this approach isn’t supported in container mode). Once set up, you can add a Python code node to parse your PDFs with Docling and then feed the output into your Single LLM calls for further structured parsing and classification.
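As a rough sketch of what the code in such a node might look like, assuming Docling has been installed in the same environment (for example with `pip install docling`): the `DocumentConverter` class and `export_to_markdown()` come from Docling's documented Python API, while the `run` function, the `pdf_path` input key, and the `raw_text` output key are hypothetical names chosen for illustration and are not PySpur's actual node interface. Adapt them to whatever the Python code node expects.

```python
from typing import Any, Optional


def pdf_to_markdown(pdf_path: str, converter: Optional[Any] = None) -> str:
    """Convert a PDF to Markdown text with Docling."""
    if converter is None:
        # Imported lazily so this module still loads where Docling is absent.
        from docling.document_converter import DocumentConverter

        converter = DocumentConverter()
    result = converter.convert(pdf_path)
    return result.document.export_to_markdown()


def run(input_data: dict, converter: Optional[Any] = None) -> dict:
    # Hypothetical entry point: the "pdf_path" and "raw_text" keys are
    # assumptions for this sketch, not part of PySpur's API.
    return {"raw_text": pdf_to_markdown(input_data["pdf_path"], converter)}
```

The `raw_text` value can then be passed to a downstream Single LLM call. The optional `converter` argument is only there so the function can be exercised with a stand-in object without loading Docling's models.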

I hope this helps, and please let us know if you have any further questions or run into any issues!

srijanpatel, Feb 23 '25 03:02