Pulling models from private OCI registries
According to #2388 it should be possible to push and pull models to a Docker/OCI registry (without authentication).
Even though it's an unsupported feature, I find it very useful and would like to contribute a short description of how to do this.
Potential use cases are:
- organisation-internal registries for orgs that limit internet access,
- serving private models,
- running Ollama on air-gapped systems, and
- saving bandwidth and download time at edge locations.
I've tried it with a local Docker registry: pushing seems to work, and pulling the manifest works as well, but pulling the blobs apparently does not. Here is what I've tried:
Run a local Docker registry v2:
docker run -d -p 5000:5000 --restart=always --name registry registry:2
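As an optional sanity check, the standard registry v2 endpoints should respond before anything is pushed:

# Returns 200 with an empty JSON body if the registry speaks the v2 API
curl -i localhost:5000/v2/
# Lists all repositories known to the registry (empty at this point)
curl localhost:5000/v2/_catalog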
Copy a model and push it to the registry:
ollama cp phi localhost:5000/mitja/phi
ollama push localhost:5000/mitja/phi --insecure
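To verify that the push landed, the registry should now list the repository and its tag (standard registry API, nothing Ollama-specific):

# Should include "mitja/phi" in the repositories array
curl localhost:5000/v2/_catalog
# Should list the "latest" tag
curl localhost:5000/v2/mitja/phi/tags/list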
Remove the copied model and pull it again (this works, I believe, because the blobs from the original phi model are still there):
ollama rm localhost:5000/mitja/phi
ollama pull localhost:5000/mitja/phi --insecure
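You can confirm that this pull was served from the local cache by looking at Ollama's on-disk store (default location assumed; it can be moved with OLLAMA_MODELS):

# Content-addressed blob files; the digests match the manifest layers
ls ~/.ollama/models/blobs/
# Per-registry manifests are stored alongside them
ls ~/.ollama/models/manifests/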
Remove both the copied and the original model, then pull the model from the private registry again (this does not work):
ollama rm phi
ollama rm localhost:5000/mitja/phi
ollama pull localhost:5000/mitja/phi --insecure
This fails with: Error: http: no Location header in response
Pull the original model, then the copied model (this works):
ollama pull phi
ollama pull localhost:5000/mitja/phi --insecure
ollama run localhost:5000/mitja/phi
Remove the registry container to clean up:
docker stop /registry
docker rm /registry
Did I miss a step or make a mistake, or is pushing/pulling the blobs just not possible yet?
I now also tried to curl the image manifest and realised that I get an error when I don't explicitly specify the manifest version in the Accept header. I believe this is because the Docker registry defaults to returning schema version 1 manifests.
According to the opencontainers/distribution-spec, the "client SHOULD include an Accept header indicating which manifest content types it supports. In a successful response, the Content-Type header will indicate the type of the returned manifest."
This doesn't work:
curl localhost:5000/v2/mitja/phi/manifests/latest | jq
Result:
{
  "errors": [
    {
      "code": "MANIFEST_INVALID",
      "message": "manifest invalid",
      "detail": {}
    }
  ]
}
This works:
curl -H "Accept: application/vnd.docker.distribution.manifest.v2+json" localhost:5000/v2/mitja/phi/manifests/latest | jq
Result:
{
  "schemaVersion": 2,
  "mediaType": "application/vnd.docker.distribution.manifest.v2+json",
  "config": {
    "mediaType": "application/vnd.docker.container.image.v1+json",
    "digest": "sha256:4ce4b16d33a334b872b8cc4f9d6929905d0bfa19bdc90c5cbed95700d22f747f",
    "size": 555
  },
  "layers": [
    {
      "mediaType": "application/vnd.ollama.image.model",
      "digest": "sha256:04778965089b91318ad61d0995b7e44fad4b9a9f4e049d7be90932bf8812e828",
      "size": 1602461536
    },
    {
      "mediaType": "application/vnd.ollama.image.license",
      "digest": "sha256:7908abcab772a6e503cfe014b6399bd58dea04576aaf79412fa66347c72bdd3f",
      "size": 1036
    },
    {
      "mediaType": "application/vnd.ollama.image.template",
      "digest": "sha256:774a15e6f1e5a0ccd2a2df78c20139ab688472bd8ed5f1ed3ef6abf505e02d02",
      "size": 77
    },
    {
      "mediaType": "application/vnd.ollama.image.system",
      "digest": "sha256:3188becd6bae82d66a6a3e68f5dee18484bbe19eeed33b873828dfcbbb2db5bb",
      "size": 132
    },
    {
      "mediaType": "application/vnd.ollama.image.params",
      "digest": "sha256:0b8127ddf5ee8a3bf3456ad2d4bb5ddbe9927b3bdca10e639f844a12d5b09099",
      "size": 42
    }
  ]
}
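Given a manifest like this, you can also check whether the registry hands out every referenced blob; a quick loop over the layer digests (reusing the curl/jq pattern above) should print 200 for each:

for digest in $(curl -s -H "Accept: application/vnd.docker.distribution.manifest.v2+json" \
    localhost:5000/v2/mitja/phi/manifests/latest | jq -r '.layers[].digest'); do
  # HEAD request per blob: 200 means the registry serves it directly,
  # a 307 with a Location header would mean it redirects to a blob store
  curl -s -o /dev/null -I -w "%{http_code} $digest\n" "localhost:5000/v2/mitja/phi/blobs/$digest"
done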
This isn't the problem, though: the Ollama code in images.go, Line 958, already adds the required Accept header.
I checked the Docker registry and Ollama server logs. Ollama seems to pull the first layer and start on the first chunk of the second layer, but then it stops.
Ollama log entry:
time=2024-10-19T21:28:33.514+02:00 level=INFO source=download.go:175 msg="downloading 04778965089b in 16 100 MB part(s)"
Server log entry (some info removed for brevity):
http.request.uri="/v2/mitja/phi/blobs/sha256:04778965089b91318ad61d0995b7e44fad4b9a9f4e049d7be90932bf8812e828"
http.request.useragent="ollama/0.3.13 (arm64 darwin) Go/go1.22.5"
http.response.contenttype="application/octet-stream"
http.response.duration=8.012ms
http.response.status=200 http.response.written=1015808
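Since Ollama requests the blob in 100 MB parts, the registry's Range handling seemed worth ruling out as well; a partial request should come back as 206 Partial Content (the digest is the model layer from the manifest above):

# Ask for the first 1 MiB of the model blob and dump the response headers;
# a Range-aware registry answers 206 with a Content-Range header, while a
# plain 200 would mean the Range header was ignored
curl -s -D - -o /dev/null -H "Range: bytes=0-1048575" \
  localhost:5000/v2/mitja/phi/blobs/sha256:04778965089b91318ad61d0995b7e44fad4b9a9f4e049d7be90932bf8812e828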
I thought it was something about the chunking, so I tried a tiny embedding model (snowflake-arctic-embed:22m) that is smaller than 100 MB, but I got the same errors.
As this is basically the fastest way to reproduce the issue, here are the Ollama commands with this model and a Docker registry running on localhost:5050:
ollama pull snowflake-arctic-embed:22m
ollama cp snowflake-arctic-embed:22m localhost:5050/library/arctic-embed
ollama push localhost:5050/library/arctic-embed --insecure
ollama rm snowflake-arctic-embed:22m
ollama rm localhost:5050/library/arctic-embed:latest
ollama pull localhost:5050/library/arctic-embed --insecure
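To double-check that the blobs really made it into the registry, you can also inspect its storage directly (assuming the container is named registry, as in the docker run command above):

# registry:2 keeps content-addressed blobs under this path by default
docker exec registry ls /var/lib/registry/docker/registry/v2/blobs/sha256/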
Unfortunately, this is all I can try at the moment (I also checked with HTTPS, but hit the same problems). Maybe someone who can run Ollama under a debugger could get further.
@mitja I fixed it in https://github.com/ollama/ollama/pull/7474: there were excessive redirect checks that were messing up requests to standard OCI registries. I've got it working now!
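If you want to try the fix before it ships in a release, you can build Ollama from the PR branch; a rough sketch (requires Go and the GitHub CLI; see the repo's development docs for the full build steps):

git clone https://github.com/ollama/ollama.git
cd ollama
gh pr checkout 7474   # fetches the branch for PR 7474
go build .            # builds the ollama binary
./ollama serve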
Same issue as #5298
We will be enabling the functionality to push and pull models to OCI registries in RamaLama, pending completion of the new "podman artifact" command:
https://github.com/containers/ramalama
It seems like we can push a model to the private registry without authentication, but pulling the model doesn't work.
I am having the same issue. Weirdly enough, in my case I can pull llama3.2 from the registry but nothing else:
ollama pull --insecure registry:5000/p/tinyllama
pulling manifest
Error: http: no Location header in response
ollama pull --insecure registry:5000/p/llama3.2
pulling manifest
pulling dde5aa3fc5ff... 100% ▕███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████▏ 2.0 GB
pulling 966de95ca8a6... 100% ▕███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████▏ 1.4 KB
pulling fcc5a6bec9da... 100% ▕███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████▏ 7.7 KB
pulling a70ff7e570d9... 100% ▕███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████▏ 6.0 KB
pulling 56bb8bd477a5... 100% ▕███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████▏ 96 B
pulling 34bb5ab01051... 100% ▕███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████▏ 561 B
verifying sha256 digest
writing manifest
success
I have checked the Docker registry and the files are there, but what is the difference between llama3.2 and tinyllama?
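One way to narrow that down would be to diff the manifests the registry returns for the two models (same curl/jq pattern as earlier in this thread; tags assumed to be latest). If tinyllama references a blob or media type the registry can't serve, it should show up here:

curl -s -H "Accept: application/vnd.docker.distribution.manifest.v2+json" \
  registry:5000/v2/p/llama3.2/manifests/latest | jq . > llama32.json
curl -s -H "Accept: application/vnd.docker.distribution.manifest.v2+json" \
  registry:5000/v2/p/tinyllama/manifests/latest | jq . > tinyllama.json
diff llama32.json tinyllama.json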