
Use Ollama native batch endpoint for multiple prompts

Open ajroetker opened this pull request 7 months ago • 1 comment

PR Checklist

  • [x] Read the Contributing documentation.
  • [x] Read the Code of conduct documentation.
  • [x] Name your Pull Request title clearly, concisely, and prefixed with the name of the primarily affected package you changed according to Good commit messages (such as memory: add interfaces for X, Y or util: add whizzbang helpers).
  • [x] Check that there isn't already a PR that solves the problem the same way to avoid creating a duplicate.
  • [x] Provide a description in this PR that addresses what the PR is solving, or reference the issue that it solves (e.g. Fixes #123).
  • [x] Describes the source of new concepts.
  • [x] References existing implementations as appropriate.
  • [x] Contains test coverage for new functions.
  • [x] Passes all golangci-lint checks.

This PR adds a CreateEmbeddings call to the Ollama provider that uses Ollama's native batch embeddings endpoint, which is much more performant when generating embeddings for large batches of chunks.
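
For context, here is a minimal sketch of what the batch endpoint looks like at the HTTP level. It targets Ollama's `/api/embed` API directly (the model name and texts are placeholders); it is an illustration of the "native batch endpoint", not code from this PR:

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
)

// embedRequest mirrors the body of Ollama's batch endpoint, POST /api/embed,
// which accepts a list of inputs in a single request.
type embedRequest struct {
	Model string   `json:"model"`
	Input []string `json:"input"`
}

// embedResponse holds one embedding vector per input text.
type embedResponse struct {
	Embeddings [][]float32 `json:"embeddings"`
}

func main() {
	body, _ := json.Marshal(embedRequest{
		Model: "nomic-embed-text",
		Input: []string{"first chunk", "second chunk", "third chunk"},
	})

	// One round trip embeds the whole batch, instead of one request
	// per prompt against the older /api/embeddings endpoint.
	resp, err := http.Post("http://localhost:11434/api/embed", "application/json", bytes.NewReader(body))
	if err != nil {
		panic(err)
	}
	defer resp.Body.Close()

	var out embedResponse
	if err := json.NewDecoder(resp.Body).Decode(&out); err != nil {
		panic(err)
	}
	fmt.Printf("got %d embeddings\n", len(out.Embeddings))
}
```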

ajroetker avatar May 20 '25 19:05 ajroetker

Wasn't sure whether you'd want the previous implementation kept around for compatibility with older Ollama versions?
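
To illustrate the compatibility question, one option would be to try the batch endpoint first and fall back to the per-prompt endpoint when the server predates it. This is only a sketch against Ollama's raw HTTP API, not the approach taken in this PR:

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
)

const ollamaHost = "http://localhost:11434"

// embedBatch tries the native batch endpoint (/api/embed) and, if the
// server does not know it, falls back to one /api/embeddings call per text.
func embedBatch(model string, texts []string) ([][]float32, error) {
	body, _ := json.Marshal(map[string]any{"model": model, "input": texts})
	resp, err := http.Post(ollamaHost+"/api/embed", "application/json", bytes.NewReader(body))
	if err != nil {
		return nil, err
	}
	defer resp.Body.Close()

	if resp.StatusCode == http.StatusOK {
		var out struct {
			Embeddings [][]float32 `json:"embeddings"`
		}
		if err := json.NewDecoder(resp.Body).Decode(&out); err != nil {
			return nil, err
		}
		return out.Embeddings, nil
	}

	// Older Ollama: embed each prompt individually.
	vectors := make([][]float32, 0, len(texts))
	for _, t := range texts {
		b, _ := json.Marshal(map[string]any{"model": model, "prompt": t})
		r, err := http.Post(ollamaHost+"/api/embeddings", "application/json", bytes.NewReader(b))
		if err != nil {
			return nil, err
		}
		var out struct {
			Embedding []float32 `json:"embedding"`
		}
		err = json.NewDecoder(r.Body).Decode(&out)
		r.Body.Close()
		if err != nil {
			return nil, err
		}
		vectors = append(vectors, out.Embedding)
	}
	return vectors, nil
}

func main() {
	vecs, err := embedBatch("nomic-embed-text", []string{"hello", "world"})
	if err != nil {
		panic(err)
	}
	fmt.Printf("embedded %d texts\n", len(vecs))
}
```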

ajroetker avatar May 20 '25 19:05 ajroetker