spring-ai Improve Ollama IT by caching the image with model

Hi,

I have an improvement for OllamaAutoConfigurationIT that can help when working locally. The current issue, as described in the test, is that the model is around 3GB, so pulling it can take quite a while depending on internet speed.

My approach uses the Testcontainers lifecycle:

First test execution

Pull image from registry
Start container
Once the container has started, pull the model
Test execution
When the test finishes, commit the changes in the container and create a local image
Stop container, stop test

Second test execution

Start container using the local image that already contains the model
Test execution
Stop container, stop test
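
The two executions above can be sketched roughly as follows. This is only an illustration of the control flow, not the code from the eventual PR: the `Docker` interface, the `FakeDocker` stand-in, the `LOCAL_IMAGE` tag, and the `some-model` name are all hypothetical, introduced so the sketch runs without a Docker daemon. In a real test, these operations would presumably be backed by Testcontainers and its underlying Docker client.

```java
import java.util.HashSet;
import java.util.Set;

public class CachedModelImageSketch {

    /** Hypothetical abstraction over the Docker operations the flow needs. */
    interface Docker {
        boolean imageExists(String image);
        String start(String image); // returns a container id
        void exec(String containerId, String... command);
        void commit(String containerId, String newImage);
        void stop(String containerId);
    }

    static final String BASE_IMAGE = "ollama/ollama:0.1.10";
    // Hypothetical tag for the locally committed image; not taken from the thread.
    static final String LOCAL_IMAGE = "local/ollama-with-model:0.1.10";

    /** Runs one test execution and returns the image the container was started from. */
    static String runOnce(Docker docker, String model, Runnable test) {
        boolean cached = docker.imageExists(LOCAL_IMAGE);
        String image = cached ? LOCAL_IMAGE : BASE_IMAGE;
        String id = docker.start(image);
        try {
            if (!cached) {
                // First execution only: pull the model inside the running container.
                docker.exec(id, "ollama", "pull", model);
            }
            test.run();
            if (!cached) {
                // First execution only: snapshot the container, model included.
                docker.commit(id, LOCAL_IMAGE);
            }
        } finally {
            docker.stop(id);
        }
        return image;
    }

    /** In-memory fake that records pulls and committed images. */
    static class FakeDocker implements Docker {
        final Set<String> images = new HashSet<>();
        int pulls = 0;
        private int nextId = 0;

        public boolean imageExists(String image) { return images.contains(image); }
        public String start(String image) { return "container-" + (nextId++); }
        public void exec(String containerId, String... command) {
            if (command.length > 1 && "pull".equals(command[1])) {
                pulls++;
            }
        }
        public void commit(String containerId, String newImage) { images.add(newImage); }
        public void stop(String containerId) { }
    }

    public static void main(String[] args) {
        FakeDocker docker = new FakeDocker();
        String first = runOnce(docker, "some-model", () -> { });
        String second = runOnce(docker, "some-model", () -> { });
        System.out.println(first + " -> " + second + ", pulls=" + docker.pulls);
    }
}
```

Note that in this sketch a failing test skips the commit, so a broken run would not leave a cached image behind; whether that is desirable is a design choice.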

If there is interest in this, I can provide the PR.

Nov 21 '23 22:11 eddumelendez

How would this work with GitHub Actions?

Dec 04 '23 21:12 markpollack

It can be used along with actions/docker-cache, with the cache key created from a hash of a new OllamaImage.java, which only has static final String OLLAMA_IMAGE = "ollama/ollama:0.1.10";
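
To illustrate why isolating the image tag in its own file gives a stable cache key, here is a small sketch. The `cacheKey` helper and the later `0.1.11` tag are hypothetical, and hashing the constant with SHA-256 only stands in for whatever file hashing the CI cache actually performs. The point is that the key changes only when the image tag changes.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

public class OllamaImageCacheKey {

    /** Mirrors the proposed OllamaImage.java: a single constant and nothing else. */
    static final class OllamaImage {
        static final String OLLAMA_IMAGE = "ollama/ollama:0.1.10";

        private OllamaImage() { }
    }

    /** Hypothetical helper: hex-encoded SHA-256 of the given text, used as a cache key. */
    static String cacheKey(String content) {
        try {
            MessageDigest digest = MessageDigest.getInstance("SHA-256");
            byte[] hash = digest.digest(content.getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder();
            for (byte b : hash) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            // SHA-256 must be supported by every Java platform implementation.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String current = cacheKey(OllamaImage.OLLAMA_IMAGE);
        String bumped = cacheKey("ollama/ollama:0.1.11"); // hypothetical later tag
        System.out.println("same key on rerun: " + current.equals(cacheKey(OllamaImage.OLLAMA_IMAGE)));
        System.out.println("new key after bump: " + !current.equals(bumped));
    }
}
```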

Dec 05 '23 17:12 eddumelendez

Hi. Sorry for the late reply and Happy New Year!

I'd very much like to explore this, as using local models is a very effective way to achieve several goals: an easy integration testing environment that is consistent for testing higher-level AI workflows.

Also, we don't need to pay as much to do this in GitHub Actions vs. paying for some always-on hosting of an AI model on Azure or HuggingFace.

Jan 12 '24 02:01 markpollack

Hi @eddumelendez, I'm a latecomer to this thread. Can you please elaborate on the cache idea? Will the OllamaImage cache the pulled Ollama models as well, or just the Docker image?

Feb 10 '24 18:02 tzolov

A new image with the model in it would be created and cached. I'll be creating an example in order to test this with GHA.

Feb 10 '24 19:02 eddumelendez

@eddumelendez, looking forward to seeing your example and findings.

Feb 11 '24 18:02 tzolov

@tzolov I have created a simple demo project to demonstrate the proposal:

1st build 4m 58s
2nd build 1m 13s

As you can see, the 2nd build is much faster. The 1st build caches the image built by Testcontainers, and it is reused in the 2nd build.

LMK if this can help. I have a PR almost ready to submit.

Feb 12 '24 20:02 eddumelendez

Thanks! It sounds promising. Please open the PR and let's continue the investigation/discussion from there.

Feb 13 '24 12:02 tzolov

ple-example $ ollama run openhermes
pulling manifest
pulling 54ee2a70d129... 99% ▕███████████████████ ▏ 4.1 GB/4.1 GB
Error: max retries exceeded: 400: <Error><Code>InvalidArgument</Code><Message>Invalid Argument: range must be positive.</Message></Error>

How do I clear the cache, or fix this error? I fixed it by running this command: ollama pull openhermes

Mar 27 '24 08:03 jamieduk
