spring-ai
Enable Ollama integration test
Currently, the test is disabled because every run has to download the image and then pull the model. Now, taking advantage of Testcontainers and a GitHub Action, a new image with the model baked in is created on the fly and then cached, so subsequent executions reuse the cached image instead.
Fixes #121
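As a rough sketch of the approach (not the PR's actual code), the idea is to start the stock `ollama/ollama` image once, pull the model inside the running container, and commit the result as a local image that later runs can reuse. The `mistral-ollama/ollama:0.1.10` tag below only mirrors the name that shows up in the logs further down and is never pulled from a registry; the class name is illustrative.

```java
// Illustrative sketch only, assuming plain Testcontainers and docker-java APIs; the
// actual PR wires this into the integration test plus a GitHub Actions cache step.
import org.testcontainers.containers.GenericContainer;
import org.testcontainers.utility.DockerImageName;

public final class BakeModelImage {

    public static void main(String[] args) throws Exception {
        try (GenericContainer<?> ollama = new GenericContainer<>(DockerImageName.parse("ollama/ollama:0.1.10"))
                .withExposedPorts(11434)) {
            ollama.start();
            // The slow step we want to do only once: pull the model inside the container.
            ollama.execInContainer("ollama", "pull", "mistral");
            // Commit the container, model included, as a new local image that a CI cache
            // (or the local Docker daemon) can reuse on the next run.
            ollama.getDockerClient()
                .commitCmd(ollama.getContainerId())
                .withRepository("mistral-ollama/ollama")
                .withTag("0.1.10")
                .exec();
        }
    }
}
```

On subsequent runs the test would only need to check whether `mistral-ollama/ollama:0.1.10` is already present locally (or restored from the Actions cache) before falling back to this slow path.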
@eddumelendez, I'm trying to run this locally. The first attempt failed with a 404 if I'm not mistaken, and consecutive runs cause:
```
java.lang.ExceptionInInitializerError
	at java.base/java.util.ArrayList.forEach(ArrayList.java:1511)
Caused by: org.testcontainers.containers.ContainerFetchException: Can't get Docker image: RemoteDockerImage(imageName=mistral-ollama/ollama:0.1.10, imagePullPolicy=org.springframework.ai.autoconfigure.ollama.OllamaChatAutoConfigurationIT$OllamaContainer$$Lambda$497/0x000000013a226ac8@1e7f2e0f, imageNameSubstitutor=org.testcontainers.utility.ImageNameSubstitutor$LogWrappedImageNameSubstitutor@1da6ee17)
	at org.testcontainers.containers.GenericContainer.getDockerImageName(GenericContainer.java:1364)
	at org.testcontainers.containers.GenericContainer.doStart(GenericContainer.java:359)
	at org.testcontainers.containers.GenericContainer.start(GenericContainer.java:330)
	at org.springframework.ai.autoconfigure.ollama.OllamaChatAutoConfigurationIT.<clinit>(OllamaChatAutoConfigurationIT.java:69)
	... 2 more
Caused by: com.github.dockerjava.api.exception.NotFoundException: Status 404: {"message":"pull access denied for mistral-ollama/ollama, repository does not exist or may require 'docker login': denied: requested access to the resource is denied"}
```
Do you know what is going on?
I made a second attempt (after removing all Docker images) and now it fails with:
```
18:37:40.513 [main] INFO tc.ollama/ollama:0.1.10 -- Image ollama/ollama:0.1.10 pull took PT29.724702S
18:37:40.513 [docker-java-stream-1404029602] INFO tc.ollama/ollama:0.1.10 -- Pull complete. 3 layers, pulled in 28s (downloaded 471 MB at 16 MB/s)
18:37:40.518 [main] INFO tc.ollama/ollama:0.1.10 -- Creating container for image: ollama/ollama:0.1.10
18:37:44.108 [main] INFO tc.ollama/ollama:0.1.10 -- Container ollama/ollama:0.1.10 is starting: 8aff3a52c42c029a229503dc4e50f59e80ee5b1074734df00852b9bd01de8af5
18:37:44.452 [main] INFO tc.ollama/ollama:0.1.10 -- Container ollama/ollama:0.1.10 started in PT3.934304S
18:39:38.730 [main] INFO org.springframework.ai.autoconfigure.ollama.OllamaChatAutoConfigurationIT -- Start pulling the 'mistral ' generative ... would take several minutes ...
18:43:47.751 [main] INFO org.springframework.ai.autoconfigure.ollama.OllamaChatAutoConfigurationIT -- mistral pulling competed!
18:43:49.397 [main] WARN org.springframework.ai.ollama.api.OllamaApi -- [404] Not Found - 404 page not found
```
Hi @tzolov, I've reviewed and fixed the issue. PR is updated and everything should work as expected now.
Thanks @eddumelendez, I just merged it, but it seems we don't have the right account for using Docker caching? https://github.com/spring-projects/spring-ai/actions/runs/8070608896 Or is there a different way to resolve this?
Looks like we need to check the Actions permissions under Settings > Actions > General to allow external actions.
Yeah, I got this as well. But I'm not sure how safe it is to whitelist an unverified action?
@eddumelendez, I've whitelisted it, but now it fails on this.
Let me try on my fork.
But I'm not sure how safe it is to whitelist an unverified action?
Totally understand. Another alternative could be to have persistent runners.
I couldn't reproduce it: https://github.com/eddumelendez/spring-ai/actions/runs/8072468166/job/22054281356#step:8:22
Can you enable debug logs, please?
This seems to have been merged in 246ba173fec3ee135bed1d30f2c64117e69e3f28.
@izeye It is merged but disabled because of https://github.com/spring-projects/spring-ai/pull/322#issuecomment-1967677261, which we still have to resolve. So I've left this issue open until we fix it.
Closing, as model-specific images are not going to be provided by Ollama.
We will pick this up as part of a larger effort to have more CI coverage outside the current GitHub Action that runs on each commit.
FWIW, I just tried my demo in a GitHub Action and the two Ollama models pulled in about 30 seconds. It's almost not worth trying to cache. If only my network were that fast. It actually takes longer to pull the ollama and chromadb Docker images than it does to pull the models.
I also tried using https://docs.docker.com/build/ci/github-actions/cache/ to see if it would help, but it fails to create the cache because (I think) Ollama is running as root, so the file permissions are fubar.
Update: I got it working with this:
```java
@Bean
@ServiceConnection
public ChromaDBContainer chroma() {
    return new ChromaDBContainer(DockerImageName.parse("ghcr.io/chroma-core/chroma:0.5.5"));
}

@Bean
@ServiceConnection
public OllamaContainer ollama() throws Exception {
    @SuppressWarnings("resource")
    OllamaContainer ollama = new OllamaContainer(DockerImageName.parse("ollama/ollama:0.3.2"))
        // Not recommended strategy from testcontainers, but the only practical way to
        // make it work locally
        .withFileSystemBind("ollama", "/root/.ollama", BindMode.READ_WRITE);
    return ollama;
}

@Bean
ApplicationRunner runner(OllamaContainer ollama) {
    return args -> {
        logger.info("Pulling models...");
        ollama.execInContainer("ollama", "pull", "albertogg/multi-qa-minilm-l6-cos-v1");
        ollama.execInContainer("ollama", "pull", "mistral");
        ollama.execInContainer("chmod", "go+r", "-R", "/root/.ollama");
        logger.info("...done");
    };
}
```
but creating the cache (on the first run) and unpacking it (subsequently) takes about 1 minute. So it's not efficient for this case. Maybe it would work better with different models or more models. Here's the workflow:
```yaml
name: Java CI with Maven

on:
  push:
    branches: [ main ]

jobs:
  build:
    name: Build and Deploy On Push
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Set up JDK 17
        uses: actions/setup-java@v4
        with:
          java-version: '17'
          distribution: 'temurin'
          cache: maven
      - name: Cache LLM Data
        id: cache-models
        uses: actions/cache@v4
        with:
          path: ollama
          key: ${{ runner.os }}-models
      - name: Install with Maven
        run: |
          ./mvnw -B install
```
UPDATE: I also tried caching the Docker images with the build-push-action and it wasn't really any better than the default - my impression was that the cache was not being used for the Docker images in the tests. The quickest overall CI run was with only the Maven cache (provided by the setup-java action), but there wasn't much in it.
The file system bind and the exec-in-container steps in the test context make a big difference to local development though, so it's definitely worth including those in the tests here (and in the docs).
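For reference, a minimal sketch of how the same pattern might look directly in a JUnit 5 test, assuming the `org.testcontainers:ollama` module is on the classpath; the model name, test class name, and the local `ollama` bind directory just mirror the beans shown above and are not prescriptive:

```java
import org.junit.jupiter.api.Assertions;
import org.junit.jupiter.api.BeforeAll;
import org.junit.jupiter.api.Test;
import org.testcontainers.containers.BindMode;
import org.testcontainers.junit.jupiter.Container;
import org.testcontainers.junit.jupiter.Testcontainers;
import org.testcontainers.ollama.OllamaContainer;
import org.testcontainers.utility.DockerImageName;

@Testcontainers
class OllamaModelCacheIT {

    @Container
    @SuppressWarnings("resource")
    static OllamaContainer ollama = new OllamaContainer(DockerImageName.parse("ollama/ollama:0.3.2"))
        // Bind a host directory so pulled models survive container restarts during local development.
        .withFileSystemBind("ollama", "/root/.ollama", BindMode.READ_WRITE);

    @BeforeAll
    static void pullModels() throws Exception {
        // Slow only on the first run; later runs find the model in the bound directory.
        ollama.execInContainer("ollama", "pull", "mistral");
    }

    @Test
    void modelIsAvailable() throws Exception {
        Assertions.assertTrue(ollama.execInContainer("ollama", "list").getStdout().contains("mistral"));
    }
}
```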