haystack-core-integrations
Unstructured File Converter: maintenance and refactoring
This component was developed a while ago, when the Unstructured ecosystem was smaller and simpler. That ecosystem has evolved over time and now includes the open-source library, free and paid APIs, API clients, and a Docker image for running the API locally.
TODO
- ~Ensure compatibility with `unstructured-client>=0.30.0` (see #1416).~ This was magically fixed in #1841
- Evaluate whether we can remove the dependency on the `unstructured` library. Initially, this was the only way to programmatically query self-hosted APIs, but we should explore if the client alone is sufficient.
- Verify that our integration correctly works with APIs hosted by Unstructured (review URLs, etc.) or fix any issues.
One more thing to keep in mind: renaming the `paths` argument to `sources`, with the same type that other converters use, so that this component is in line with what other converters expect :)
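A rough sketch of how the `paths` to `sources` rename could be rolled out without breaking existing callers. The helper name `resolve_sources` is hypothetical and not part of the integration, and the deprecation period is an assumption rather than a decision. To keep the snippet self-contained it only handles `str` and `Path`; the real type would presumably match the other converters, which as far as I know also accept `ByteStream`.

```python
import warnings
from pathlib import Path
from typing import List, Optional, Union


def resolve_sources(
    sources: Optional[List[Union[str, Path]]] = None,
    paths: Optional[List[Union[str, Path]]] = None,
) -> List[Path]:
    """Hypothetical helper: accept the new `sources` argument, still honor the old `paths`."""
    if paths is not None:
        if sources is not None:
            raise ValueError("Pass either `sources` or the deprecated `paths`, not both.")
        # Assumption: `paths` stays available for a deprecation period.
        warnings.warn(
            "`paths` is deprecated; use `sources` instead.",
            DeprecationWarning,
            stacklevel=2,
        )
        sources = paths
    if sources is None:
        raise ValueError("`sources` is required.")
    # Normalize every entry to a Path so downstream code sees a single type.
    return [Path(source) for source in sources]
```

The converter's `run` method could call something like this first, so old code that passes `paths=` keeps working with a warning while new code uses `sources=`.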