
Pulling models from private OCI Registries

Open mitja opened this issue 1 year ago • 7 comments

According to #2388 it should be possible to push models to and pull them from a Docker/OCI registry (without authentication).

Even though it's an unsupported feature, I find it very useful and would like to contribute a short description of how to do this.

Potential use cases are

  • organisation-internal registries for orgs that limit internet access,
  • serving private models,
  • running Ollama on air-gapped systems, and
  • saving bandwidth and download time at edge locations.

I've tried it with a local Docker registry: pushing seems to work and pulling the manifest works as well, but pulling the blobs apparently does not. Here is what I've tried:

Run a local Docker registry v2:

docker run -d -p 5000:5000 --restart=always --name registry registry:2
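
Before pushing, a quick way to confirm the registry is reachable is the API version check endpoint (registry:2 should answer with HTTP 200 and an empty JSON body):

curl http://localhost:5000/v2/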

Copy a model and push it to the registry:

ollama cp phi localhost:5000/mitja/phi
ollama push localhost:5000/mitja/phi --insecure
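
To verify the push actually landed, you can query the registry's catalog and tag listing endpoints (both part of the standard distribution API); the output should list mitja/phi and its latest tag:

curl http://localhost:5000/v2/_catalog
curl http://localhost:5000/v2/mitja/phi/tags/list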

Remove the copied model and pull it again (this works, I believe, because the blobs from the original phi model are still present locally):

ollama rm localhost:5000/mitja/phi
ollama pull localhost:5000/mitja/phi --insecure

Remove both the copied and the original model, then pull the model from the private registry again (this does not work):

ollama rm phi
ollama rm localhost:5000/mitja/phi
ollama pull localhost:5000/mitja/phi --insecure

This fails with: Error: http: no Location header in response

Pull the original model, then the copied model (works):

ollama pull phi
ollama pull localhost:5000/mitja/phi --insecure
ollama run localhost:5000/mitja/phi

Remove the registry container to clean up:

docker stop registry
docker rm registry

Did I miss a step or make a mistake, or is pushing/pulling the blobs not yet possible?

mitja avatar Oct 17 '24 18:10 mitja

I now also tried to curl the image manifest and realised that I get an error when I don't explicitly specify the manifest version in the Accept header. I believe this is because the Docker registry defaults to returning schema version 1 manifests.

According to the opencontainers/distribution-spec, the "client SHOULD include an Accept header indicating which manifest content types it supports. In a successful response, the Content-Type header will indicate the type of the returned manifest."

This doesn't work:

curl localhost:5000/v2/mitja/phi/manifests/latest | jq 

Result:

{
  "errors": [
    {
      "code": "MANIFEST_INVALID",
      "message": "manifest invalid",
      "detail": {}
    }
  ]
}

This works:

curl -H "Accept: application/vnd.docker.distribution.manifest.v2+json" localhost:5000/v2/mitja/phi/manifests/latest | jq

Result:

{
  "schemaVersion": 2,
  "mediaType": "application/vnd.docker.distribution.manifest.v2+json",
  "config": {
    "mediaType": "application/vnd.docker.container.image.v1+json",
    "digest": "sha256:4ce4b16d33a334b872b8cc4f9d6929905d0bfa19bdc90c5cbed95700d22f747f",
    "size": 555
  },
  "layers": [
    {
      "mediaType": "application/vnd.ollama.image.model",
      "digest": "sha256:04778965089b91318ad61d0995b7e44fad4b9a9f4e049d7be90932bf8812e828",
      "size": 1602461536
    },
    {
      "mediaType": "application/vnd.ollama.image.license",
      "digest": "sha256:7908abcab772a6e503cfe014b6399bd58dea04576aaf79412fa66347c72bdd3f",
      "size": 1036
    },
    {
      "mediaType": "application/vnd.ollama.image.template",
      "digest": "sha256:774a15e6f1e5a0ccd2a2df78c20139ab688472bd8ed5f1ed3ef6abf505e02d02",
      "size": 77
    },
    {
      "mediaType": "application/vnd.ollama.image.system",
      "digest": "sha256:3188becd6bae82d66a6a3e68f5dee18484bbe19eeed33b873828dfcbbb2db5bb",
      "size": 132
    },
    {
      "mediaType": "application/vnd.ollama.image.params",
      "digest": "sha256:0b8127ddf5ee8a3bf3456ad2d4bb5ddbe9927b3bdca10e639f844a12d5b09099",
      "size": 42
    }
  ]
}
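
With the digests from this manifest, the individual blobs can also be fetched directly with curl, which helps separate registry-side problems from client-side ones. For example, grabbing the params layer (the smallest one) and checking its size, which should match the 42 bytes listed in the manifest (-L follows any redirect the registry might issue):

curl -sL -o params.json http://localhost:5000/v2/mitja/phi/blobs/sha256:0b8127ddf5ee8a3bf3456ad2d4bb5ddbe9927b3bdca10e639f844a12d5b09099
wc -c params.json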

mitja avatar Oct 19 '24 09:10 mitja

This isn't the problem: the Ollama code in images.go (line 958) already adds the required Accept header.

I checked the Docker registry and Ollama server logs. Ollama seems to pull the first layer and start on the first chunk of the second layer, but then stops.

Ollama log entry:

time=2024-10-19T21:28:33.514+02:00 level=INFO source=download.go:175 msg="downloading 04778965089b in 16 100 MB part(s)"

Server log entry (some info removed for brevity):

http.request.uri="/v2/mitja/phi/blobs/sha256:04778965089b91318ad61d0995b7e44fad4b9a9f4e049d7be90932bf8812e828" 
http.request.useragent="ollama/0.3.13 (arm64 darwin) Go/go1.22.5" 
http.response.contenttype="application/octet-stream" 
http.response.duration=8.012ms 
http.response.status=200 http.response.written=1015808 
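
The log line suggests Ollama fetches large blobs in 100 MB parts, presumably via HTTP Range requests. A plain registry:2 should honor ranged blob GETs with a 206 Partial Content response, which can be checked manually (a diagnostic sketch using the digest from the log; -D - dumps the response headers, -r requests the first 100 MB part):

curl -s -o /dev/null -D - -r 0-104857599 \
  http://localhost:5000/v2/mitja/phi/blobs/sha256:04778965089b91318ad61d0995b7e44fad4b9a9f4e049d7be90932bf8812e828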

I thought it might be something about the chunking, so I tried a tiny embedding model (snowflake-arctic-embed:22m) that is smaller than 100 MB, but got the same errors.

As this is basically the fastest way to reproduce the issue, here are the Ollama commands for this model, with a Docker registry running on localhost:5050:

ollama pull snowflake-arctic-embed:22m
ollama cp snowflake-arctic-embed:22m localhost:5050/library/arctic-embed
ollama push localhost:5050/library/arctic-embed --insecure
ollama rm snowflake-arctic-embed:22m
ollama rm localhost:5050/library/arctic-embed:latest
ollama pull localhost:5050/library/arctic-embed --insecure
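
Before the failing pull, you can confirm the tiny model actually made it into the registry (tags/list is part of the standard distribution API):

curl http://localhost:5050/v2/library/arctic-embed/tags/list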

Unfortunately, this is all I can try at the moment (I also tested with HTTPS, with the same problems). Maybe someone who can run Ollama under a debugger can get further.

mitja avatar Oct 19 '24 19:10 mitja

@mitja I fixed it in https://github.com/ollama/ollama/pull/7474: there were excessive redirect checks that were messing up requests to standard OCI registries. I have it working now!
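
For anyone curious what trips the redirect checks: a plain registry:2 answers blob requests directly with a 200 and no Location header, whereas registries backed by object storage typically reply with a 307 redirect to a signed URL. You can see the local behavior with a HEAD request (assuming the phi blob from earlier in the thread is still in the registry):

curl -sI http://localhost:5000/v2/mitja/phi/blobs/sha256:04778965089b91318ad61d0995b7e44fad4b9a9f4e049d7be90932bf8812e828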

peterwilli avatar Nov 02 '24 23:11 peterwilli

Same issue as #5298

cniweb avatar Jan 09 '25 07:01 cniweb

We will be enabling pushing and pulling models to and from OCI registries in RamaLama, pending completion of the new "podman artifact" command:

https://github.com/containers/ramalama

ericcurtin avatar Jan 19 '25 18:01 ericcurtin

It seems we can push a model to a private registry without authentication, but pulling the model doesn't work.

bupd avatar Feb 19 '25 08:02 bupd

I am having the same issue. Weirdly enough, in my case I can pull llama3.2 from the registry but nothing else:

ollama pull --insecure registry:5000/p/tinyllama
pulling manifest
Error: http: no Location header in response

ollama pull --insecure registry:5000/p/llama3.2
pulling manifest
pulling dde5aa3fc5ff... 100% ▕████████████████▏ 2.0 GB
pulling 966de95ca8a6... 100% ▕████████████████▏ 1.4 KB
pulling fcc5a6bec9da... 100% ▕████████████████▏ 7.7 KB
pulling a70ff7e570d9... 100% ▕████████████████▏ 6.0 KB
pulling 56bb8bd477a5... 100% ▕████████████████▏ 96 B
pulling 34bb5ab01051... 100% ▕████████████████▏ 561 B
verifying sha256 digest
writing manifest
success

I have checked the Docker registry and the files are there, but what is the difference between llama3.2 and tinyllama?
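
One way to investigate would be to compare the two manifests directly, reusing the Accept header trick from earlier in this thread (registry:5000 is the hostname from the pull commands above):

curl -s -H "Accept: application/vnd.docker.distribution.manifest.v2+json" http://registry:5000/v2/p/tinyllama/manifests/latest | jq
curl -s -H "Accept: application/vnd.docker.distribution.manifest.v2+json" http://registry:5000/v2/p/llama3.2/manifests/latest | jq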

danielfl avatar Feb 27 '25 13:02 danielfl