Sebastian Husch Lee comments

Results: 235 comments of Sebastian Husch Lee

feat: Add `failed_sources` to the output of converters

> When this happens, it would be helpful to route the PDF to another converter, for example. This could also apply to empty PDF content, which might be scanned documents...

feat: Add `failed_sources` to the output of converters

> [@sjrl](https://github.com/sjrl) I'll give `DocumentLengthRouter` a try. Just curious, what's the reason for not enabling a second list of empty/failed files in the converters? Also, I will share these files...

Enable/refine reasoning support

@anakin87 as a follow-up for issues that haven't been closed yet, we should probably also try to adopt the new `reasoning` field of `StreamingChunk`. Opened issue...

haystack-ai cycle in dependency

Hey @dragonTalon, could you provide the code snippet that caused the issue?

Fix: Update Agent's docstrings

@agnieszka-m I'm also wondering if there is a way we could make the yaml part a more complete example. For example, this requires the core integration [`mcp-haystack`](https://pypi.org/project/mcp-haystack/) to be installed...

Fix: Update Agent's docstrings

@agnieszka-m feel free to update this PR with the changes we discussed offline!

Drop Python 3.9 support because of Python 3.9 EOL

Additionally, we should:

- Consider updating these files, which have code specific to handling types in Python 3.9. I say "consider" since if we remove support for deserializing and comparing...

feat: MarkdownHeaderSplitter

@OGuggenbuehl definitely looks like an interesting approach! I've left an initial set of comments, but to further review I'd appreciate if you could add a set of tests like the...

feat: MarkdownHeaderSplitter

Thanks for your continued work on this, @OGuggenbuehl! Some general comments. Could you:

- Add a release note for this PR following the instructions [here](https://github.com/deepset-ai/haystack/blob/main/CONTRIBUTING.md#release-notes)
- Make sure...

feat: MarkdownHeaderSplitter

> @sjrl I have been thinking about whether keeping `_infer_header_levels` as a method makes sense for this. It's only useful in certain cases; the algorithm does not perfectly recreate document...