Sebastian Husch Lee
> When this happens, it would be helpful to route the PDF to another converter, for example. This could also apply to empty PDF content, which might be scanned documents...
> [@sjrl](https://github.com/sjrl) I'll give `DocumentLengthRouter` a try. Just curious, what's the reason for not enabling a second list of empty/failed files in the converters? Also, I will share these files...
@anakin87 as a follow-up for issues that haven't been closed yet, we should probably also adopt the new `reasoning` field of `StreamingChunk`. Opened issue...
Hey @dragonTalon, could you provide the code snippet that caused the issue?
@agnieszka-m I'm also wondering if there is a way we could make the YAML part a more complete example. For example, this requires the core integration [`mcp-haystack`](https://pypi.org/project/mcp-haystack/) to be installed...
@agnieszka-m feel free to update this PR with the changes we discussed offline!
Additionally, we should:
- Consider updating these files, which have code specific to handling types in Python 3.9. I say consider since if we remove support for deserializing and comparing...
@OGuggenbuehl definitely looks like an interesting approach! I've left an initial set of comments, but to review further I'd appreciate it if you could add a set of tests like the...
Thanks for your continued work on this @OGuggenbuehl! Some general comments. Could you:
- Add a release note for this PR following the instructions [here](https://github.com/deepset-ai/haystack/blob/main/CONTRIBUTING.md#release-notes)
- Make sure...
> @sjrl I have been thinking about whether keeping `_infer_header_levels` as a method makes sense for this. it's only useful in certain cases, the algorithm does not perfectly recreate document...