spring-ai
Improve Ollama IT by caching the image with model
Hi,
I have an improvement for OllamaAutoConfigurationIT that can help when working locally. The current issue, as described in the test, is that the model is around 3GB, so pulling it can take quite a while depending on internet speed.
My approach uses the Testcontainers lifecycle:
First test execution
- Pull image from registry
- Start container
- Once the container has started, pull the model
- Test execution
- When the test is finishing, commit the changes in the container and create a local image.
- Stop container, stop test
Second test execution
- Start container using the local image that already contains the model
- Test execution
- Stop container, stop test
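The two runs above could be sketched roughly as follows. This is only an illustration, not the PR: to stay dependency-free it spells the steps out as plain docker CLI commands instead of going through the Testcontainers API, and the class name, local image name, container name, and model name are all hypothetical placeholders.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: lists the docker CLI calls for one test run.
// It only builds the commands; running them is left to the caller.
final class OllamaImageCachePlan {

    // Base image tag mentioned in this thread.
    static final String BASE_IMAGE = "ollama/ollama:0.1.10";

    // True if an image with this name already exists locally
    // ("docker image inspect" exits with 0 when the image is found).
    static boolean localImageExists(String imageName) throws IOException, InterruptedException {
        Process process = new ProcessBuilder("docker", "image", "inspect", imageName)
                .redirectErrorStream(true)
                .redirectOutput(ProcessBuilder.Redirect.DISCARD)
                .start();
        return process.waitFor() == 0;
    }

    // Commands to run before the tests start.
    static List<List<String>> beforeTests(boolean cached, String localImage,
                                          String model, String containerName) {
        List<List<String>> steps = new ArrayList<>();
        if (!cached) {
            // First execution: pull the base image from the registry.
            steps.add(List.of("docker", "pull", BASE_IMAGE));
        }
        // Start from the local image when it exists, otherwise from the base image.
        String image = cached ? localImage : BASE_IMAGE;
        steps.add(List.of("docker", "run", "-d", "--rm", "--name", containerName,
                "-p", "11434:11434", image));
        if (!cached) {
            // First execution only: pull the model inside the running container.
            steps.add(List.of("docker", "exec", containerName, "ollama", "pull", model));
        }
        return steps;
    }

    // Commands to run once the tests have finished.
    static List<List<String>> afterTests(boolean cached, String localImage, String containerName) {
        List<List<String>> steps = new ArrayList<>();
        if (!cached) {
            // First execution only: save the container, model included, as a local image.
            steps.add(List.of("docker", "commit", containerName, localImage));
        }
        steps.add(List.of("docker", "stop", containerName));
        return steps;
    }
}
```

With `cached` set to the result of `localImageExists(...)`, the first run yields pull, run, model pull, then commit and stop; every later run yields only run and stop.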
If there is interest in this, I can provide the PR.
How would this work with GitHub Actions?
It can be used along with actions/docker-cache, creating a hash of a new OllamaImage.java, which only has static final String OLLAMA_IMAGE = "ollama/ollama:0.1.10";
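To illustrate why hashing that file works as a cache key: any change to the image tag changes the file content, which changes the hash and so invalidates the cached image. A minimal sketch, assuming SHA-256 and a made-up `ollama-image-` key prefix (in a real workflow the hashing would more likely be done by the CI itself, for example with the `hashFiles` expression in GitHub Actions):

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper: derives a cache key from the text of OllamaImage.java.
final class OllamaCacheKey {

    static String cacheKey(String sourceText) {
        try {
            MessageDigest sha256 = MessageDigest.getInstance("SHA-256");
            byte[] digest = sha256.digest(sourceText.getBytes(StandardCharsets.UTF_8));
            // Hex-encode the 32-byte digest (64 characters).
            StringBuilder hex = new StringBuilder();
            for (byte b : digest) {
                hex.append(String.format("%02x", b));
            }
            return "ollama-image-" + hex;
        } catch (NoSuchAlgorithmException e) {
            // SHA-256 is one of the algorithms every Java platform must provide.
            throw new IllegalStateException(e);
        }
    }
}
```

The same file content always gives the same key, so the cached image is reused until OLLAMA_IMAGE is changed.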
Hi. Sorry for the late reply and Happy New Year!
I'd very much like to explore this. Using local models is an effective way to get an easy, consistent integration-testing environment for higher-level AI workflows.
Also, we don't need to pay as much to do this in GitHub Actions vs. paying for always-on hosting of an AI model on Azure or HuggingFace.
Hi @eddumelendez, I'm a latecomer to this thread. Can you please elaborate on the cache idea? Will the OllamaImage cache the pulled Ollama models as well? Or just the Docker image?
A new image with the model in it would be created and cached. I'll be creating an example in order to test the GHA.
@eddumelendez, looking forward to seeing your example and findings.
@tzolov I have created a simple demo project to demonstrate the proposal.
As you can see, the 2nd build is much faster. The 1st build caches the image built by Testcontainers, and it is reused in the 2nd build.
LMK if this can help. I have a PR almost ready to submit.
Thanks! It sounds promising. Please fire the PR and let's continue the investigation/discussion from there.
ple-example $ ollama run openhermes
pulling manifest
pulling 54ee2a70d129... 99% ▕███████████████████ ▏ 4.1 GB/4.1 GB
Error: max retries exceeded: 400: <Error><Code>InvalidArgument</Code><Message>Invalid Argument: range must be positive.</Message></Error>
How do I clear the cache? Or fix this bs error!
I fixed it by running this command:
ollama pull openhermes