Remove Hard Dependency on unstructured.io API key
They apparently no longer provide free access to API keys?!
@Brandonagil Would something like LlamaParse work for your use case? It's not open source, though.
@MODSetter There is Aryn DocParse, which uses the open source Sycamore project.
@Brandonagil LMK if https://github.com/MODSetter/SurfSense/pull/123 solves this issue for now.
By the way, wouldn't it be easier to write your own file-parsing function instead of using the Unstructured/LlamaParse API? Unstructured is open source, so it would be enough to add a Python file that processes uploaded files. RAG over documents would then work offline, for free, and without any limits.
Running the offline version of Unstructured is not an easy task. Moreover, most SurfSense users are not technically proficient enough to write their own code.
However, you can still run it with the offline version of Unstructured by changing the base_url and not providing an API key.
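For anyone wondering what "changing the base_url and not providing an API key" could look like in practice, here is a rough sketch. It is not SurfSense's actual code. The environment variable names `UNSTRUCTURED_BASE_URL` and `UNSTRUCTURED_API_KEY` and the helper `resolve_unstructured_endpoint` are made up for illustration, and the `/general/v0/general` path and `unstructured-api-key` header are what the Unstructured API uses as far as I know, so please double-check them against your deployment.

```python
import os

# Hosted endpoint used as the fallback. This value is an assumption;
# verify it against the Unstructured documentation for your account.
HOSTED_URL = "https://api.unstructured.io/general/v0/general"


def resolve_unstructured_endpoint(env=None):
    """Return (url, headers) for a partition request.

    Hypothetical helper: the variable names read here are illustrative,
    not SurfSense's real configuration keys.
    """
    env = os.environ if env is None else env

    # Point at a self-hosted server by overriding the base URL.
    url = env.get("UNSTRUCTURED_BASE_URL") or HOSTED_URL

    headers = {"Accept": "application/json"}

    # Only send the key header when a key is actually configured,
    # so a local server can be used without one.
    api_key = env.get("UNSTRUCTURED_API_KEY")
    if api_key:
        headers["unstructured-api-key"] = api_key

    return url, headers
```

With `UNSTRUCTURED_BASE_URL=http://localhost:8000/general/v0/general` set and no key, the helper returns the local URL and no key header. With nothing set, it falls back to the hosted URL.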
Docling support is planned to extend local ETL service support: https://github.com/MODSetter/SurfSense/issues/161