Fix streaming chat execution for Ollama
Motivation and Context
Problem:
The executeStreaming method in OllamaClient does not function as expected. It incorrectly waits for the full API response before returning, defeating the purpose of a streaming call. This is caused by the underlying Ktor post call, which resolves to the non-streaming fetchResponse method.
Solution:
This pull request modifies the implementation to use preparePost().execute() instead of the direct .post() method. This pattern ensures that Ktor's fetchStreamingResponse is called, enabling true, asynchronous streaming of the response.
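For illustration, a minimal sketch of the pattern in Ktor (the function name `streamChat` and the request details are hypothetical, not taken from this PR; Ollama emits the response as a sequence of JSON chunks, one per line):

```kotlin
import io.ktor.client.HttpClient
import io.ktor.client.request.preparePost
import io.ktor.client.request.setBody
import io.ktor.client.statement.bodyAsChannel
import io.ktor.utils.io.readUTF8Line
import kotlinx.coroutines.flow.Flow
import kotlinx.coroutines.flow.flow

fun streamChat(client: HttpClient, url: String, requestBody: String): Flow<String> = flow {
    // preparePost() builds an HttpStatement without sending it; execute { }
    // then streams the response body as it arrives, whereas a plain post()
    // buffers the entire response before returning.
    client.preparePost(url) {
        setBody(requestBody)
    }.execute { response ->
        val channel = response.bodyAsChannel()
        while (!channel.isClosedForRead) {
            val line = channel.readUTF8Line() ?: break
            emit(line) // each emitted line is one streamed JSON chunk
        }
    }
}
```

With `.post()`, the response body is fully received before the call returns, which is why the old `executeStreaming` appeared to block until completion; `execute { }` keeps the connection open and hands the caller an incremental channel instead.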
Breaking Changes
There are no breaking changes.
Type of the changes
- [ ] New feature (non-breaking change which adds functionality)
- [x] Bug fix (non-breaking change which fixes an issue)
- [ ] Breaking change (fix or feature that would cause existing functionality to change)
- [ ] Documentation update
- [ ] Tests improvement
- [ ] Refactoring
Checklist
- [x] The pull request has a description of the proposed change
- [x] I read the Contributing Guidelines before opening the pull request
- [x] The pull request uses `develop` as the base branch
- [x] Tests for the changes have been added
- [x] All new and existing tests passed
Additional steps for pull requests adding a new feature
- [x] An issue describing the proposed change exists
- [x] The pull request includes a link to the issue
- [x] The change was discussed and approved in the issue
- [x] Docs have been added / updated