Fix streaming chat execution for Ollama
Motivation and Context
Problem:
The executeStreaming method in OllamaClient does not function as expected. It incorrectly waits for the full API response before returning, defeating the purpose of a streaming call. This is caused by the underlying Ktor post call, which resolves to the non-streaming fetchResponse method.
Solution:
This pull request modifies the implementation to use preparePost().execute() instead of the direct .post() method. This pattern ensures that Ktor's fetchStreamingResponse is called, enabling true, asynchronous streaming of the response.
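For illustration, a minimal sketch of the pattern in Ktor (the function name `streamChat` and the request details are hypothetical, not taken from this PR; Ollama emits the response as a sequence of JSON chunks, one per line):

```kotlin
import io.ktor.client.HttpClient
import io.ktor.client.request.preparePost
import io.ktor.client.request.setBody
import io.ktor.client.statement.bodyAsChannel
import io.ktor.utils.io.readUTF8Line
import kotlinx.coroutines.flow.Flow
import kotlinx.coroutines.flow.flow

fun streamChat(client: HttpClient, url: String, requestBody: String): Flow<String> = flow {
    // preparePost() builds an HttpStatement without sending it; execute { }
    // then streams the response body as it arrives, whereas a plain post()
    // buffers the entire response before returning.
    client.preparePost(url) {
        setBody(requestBody)
    }.execute { response ->
        val channel = response.bodyAsChannel()
        while (!channel.isClosedForRead) {
            val line = channel.readUTF8Line() ?: break
            emit(line) // each emitted line is one streamed JSON chunk
        }
    }
}
```

With `.post()`, the response body is fully received before the call returns, which is why the old `executeStreaming` appeared to block until completion; `execute { }` keeps the connection open and hands the caller an incremental channel instead.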
Breaking Changes
There are no breaking changes.
Type of the changes
- [ ] New feature (non-breaking change which adds functionality)
- [x] Bug fix (non-breaking change which fixes an issue)
- [ ] Breaking change (fix or feature that would cause existing functionality to change)
- [ ] Documentation update
- [ ] Tests improvement
- [ ] Refactoring
Checklist
- [x] The pull request has a description of the proposed change
- [x] I read the Contributing Guidelines before opening the pull request
- [x] The pull request uses `develop` as the base branch
- [x] Tests for the changes have been added
- [x] All new and existing tests passed
Additional steps for pull requests adding a new feature
- [x] An issue describing the proposed change exists
- [x] The pull request includes a link to the issue
- [x] The change was discussed and approved in the issue
- [x] Docs have been added / updated