Improve Ollama IT by caching the image with model

Open eddumelendez opened this issue 1 year ago • 3 comments

Hi,

I have an improvement for OllamaAutoConfigurationIT which can help when working locally. The current issue, as described in the test, is that the model is around 3 GB, which can take quite a while to download depending on internet speed.

My approach uses the Testcontainers lifecycle:

First test execution

  1. Pull image from registry
  2. Start container
  3. Once the container has started, pull the model
  4. Test execution
  5. When the test finishes, commit the changes in the container and create a local image.
  6. Stop container, stop test

Second test execution

  1. Start container using the local image that already contains the model
  2. Test execution
  3. Stop container, stop test
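The two runs above can be sketched as a single control flow. This is only an illustration under stated assumptions: `ContainerRuntime` is a hypothetical interface standing in for the real Docker/Testcontainers calls, `RecordingRuntime` is an in-memory fake, and the image and model names passed to it are placeholders.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical abstraction over the Docker/Testcontainers calls; not a real library API.
interface ContainerRuntime {
    boolean localImageExists(String image);
    void pullImage(String image);
    String startContainer(String image); // returns a container id
    void pullModel(String containerId, String model);
    void commit(String containerId, String newImage);
    void stopContainer(String containerId);
}

final class CachedOllamaLifecycle {
    private CachedOllamaLifecycle() {}

    // First run: pull base image, start, pull model, test, commit, stop.
    // Later runs: start from the committed image, test, stop.
    static void runTests(ContainerRuntime rt, String baseImage, String cachedImage,
                         String model, Runnable tests) {
        boolean cached = rt.localImageExists(cachedImage);
        if (!cached) {
            rt.pullImage(baseImage);
        }
        String id = rt.startContainer(cached ? cachedImage : baseImage);
        try {
            if (!cached) {
                rt.pullModel(id, model);
            }
            tests.run();
            if (!cached) {
                // Only reached when the tests did not throw.
                rt.commit(id, cachedImage);
            }
        } finally {
            rt.stopContainer(id);
        }
    }
}

// In-memory fake that records each call, so the order of steps can be inspected.
final class RecordingRuntime implements ContainerRuntime {
    final Set<String> localImages = new HashSet<>();
    final List<String> log = new ArrayList<>();

    public boolean localImageExists(String image) { return localImages.contains(image); }
    public void pullImage(String image) { log.add("pull " + image); localImages.add(image); }
    public String startContainer(String image) { log.add("start " + image); return "container-1"; }
    public void pullModel(String containerId, String model) { log.add("pull-model " + model); }
    public void commit(String containerId, String newImage) { log.add("commit " + newImage); localImages.add(newImage); }
    public void stopContainer(String containerId) { log.add("stop"); }
}
```

In this sketch the commit is skipped when the tests throw, so a failed first run pulls the model again next time. Committing right after the model pull would be an alternative if that matters.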

If there is interest in this, I can provide the PR.

eddumelendez avatar Nov 21 '23 22:11 eddumelendez

How would this work with GitHub Actions?

markpollack avatar Dec 04 '23 21:12 markpollack

It can be used along with actions/docker-cache, with the cache key created from a hash of a new OllamaImage.java, which only has static final String OLLAMA_IMAGE = "ollama/ollama:0.1.10";
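To make the hashing idea concrete, here is a minimal sketch of deriving a cache key from the contents of a file such as OllamaImage.java, so that changing the image tag changes the key. The `ImageCacheKey` class and the key prefix are hypothetical. In a real workflow the `hashFiles` expression in GitHub Actions would normally do this job, and its output would not match the value computed here.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper: builds "<prefix>-<sha256 hex of the file contents>".
final class ImageCacheKey {
    private ImageCacheKey() {}

    static String of(String prefix, byte[] fileContents) {
        try {
            MessageDigest sha256 = MessageDigest.getInstance("SHA-256");
            byte[] digest = sha256.digest(fileContents);
            StringBuilder key = new StringBuilder(prefix).append('-');
            for (byte b : digest) {
                key.append(String.format("%02x", b)); // two lowercase hex digits per byte
            }
            return key.toString();
        } catch (NoSuchAlgorithmException e) {
            // SHA-256 is required to be available on every Java platform.
            throw new IllegalStateException(e);
        }
    }

    // Convenience overload for source text, encoded as UTF-8.
    static String of(String prefix, String source) {
        return of(prefix, source.getBytes(StandardCharsets.UTF_8));
    }
}
```

Bumping the tag inside the file, for example from 0.1.10 to a newer release, yields a different key, so the previously cached image is not reused.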

eddumelendez avatar Dec 05 '23 17:12 eddumelendez

Hi. Sorry for the late reply and Happy New Year!

I'd very much like to explore this. Using local models is a very effective way to get a simple, consistent integration testing environment for higher-level AI workflows.

Also, we don't need to pay as much to do this in GitHub Actions vs. paying for always-on hosting of an AI model on Azure or HuggingFace.

markpollack avatar Jan 12 '24 02:01 markpollack

Hi @eddumelendez, I'm a latecomer to this thread. Can you please elaborate on the cache idea? Will the OllamaImage cache the pulled Ollama models as well, or just the Docker image?

tzolov avatar Feb 10 '24 18:02 tzolov

A new image with the model in it would be created and cached. I'll be creating an example in order to test the GHA.

eddumelendez avatar Feb 10 '24 19:02 eddumelendez

@eddumelendez, looking forward to seeing your example and findings.

tzolov avatar Feb 11 '24 18:02 tzolov

@tzolov I have created a simple demo project to demonstrate the proposal.

As you can see, the 2nd build is much faster. The 1st build caches the image built by Testcontainers, and it is reused in the 2nd build.

LMK if this can help. I have a PR almost ready to submit.

eddumelendez avatar Feb 12 '24 20:02 eddumelendez

Thanks! It sounds promising. Please fire the PR and let's continue the investigation/discussion from there.

tzolov avatar Feb 13 '24 12:02 tzolov

ple-example $ ollama run openhermes
pulling manifest
pulling 54ee2a70d129... 99% ▕███████████████████ ▏ 4.1 GB/4.1 GB
Error: max retries exceeded: 400: <Error><Code>InvalidArgument</Code><Message>Invalid Argument: range must be positive.</Message></Error>

How do I clear the cache or fix this error? I fixed it by running this command: ollama pull openhermes

jamieduk avatar Mar 27 '24 08:03 jamieduk