
Explore replacing python3 ollama puller with "podman artifact pull"

Open ericcurtin opened this issue 8 months ago • 17 comments

We wrote an ollama puller from scratch in python3; it's basically two HTTP requests. The final request is essentially what a "podman artifact pull" does, so explore whether we can replace this pull with "podman artifact pull".
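For context, a minimal sketch of what that two-request pull looks like, assuming the registry-v2-style manifest and blob endpoints the Ollama registry exposes; the model name, tag, and output path are illustrative:

```python
# Minimal sketch of the two-request Ollama pull described above, assuming
# the registry-v2-style layout (manifests and blobs) the Ollama registry serves.
import json
import urllib.request

REGISTRY = "https://registry.ollama.ai"
MODEL = "library/smollm"   # illustrative
TAG = "latest"

# Request 1: fetch the manifest that lists the model's layers.
req = urllib.request.Request(
    f"{REGISTRY}/v2/{MODEL}/manifests/{TAG}",
    headers={"Accept": "application/vnd.docker.distribution.manifest.v2+json"},
)
with urllib.request.urlopen(req) as resp:
    manifest = json.load(resp)

# Request 2: download the layer that holds the GGUF model weights.
for layer in manifest["layers"]:
    if layer["mediaType"] == "application/vnd.ollama.image.model":
        blob_url = f"{REGISTRY}/v2/{MODEL}/blobs/{layer['digest']}"
        with urllib.request.urlopen(blob_url) as blob, open("smollm.gguf", "wb") as out:
            out.write(blob.read())
```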

Potential benefits:

  • pushing Ollama artifacts to an OCI registry without conversion steps
  • we can take advantage of podman's authentication code and, in general, the great compatibility podman has with many types of OCI registries

ericcurtin avatar Apr 03 '25 11:04 ericcurtin

Python3 implementation:

https://github.com/ericcurtin/lm-pull/blob/f688fb83fce2c96efeadb096b6cdaea11e133d4a/lm-pull.py#L262

The current RamaLama implementation is based on this, but is more complex.

ericcurtin avatar Apr 04 '25 11:04 ericcurtin

Alright, I've been working on this recently and while podman artifact pull can totally be used for the ollama registry, the problem is the registry itself. The registry at registry.ollama.ai doesn't fully adhere to the protocol for pulling images. Just querying the registry status URL, which is the first step in pulling, gives a 404, and that's where the code stops because it doesn't know there's a registry there at all (it's like a red herring lol).

podman artifact pull registry.ollama.ai/library/smollm   
Error: initializing source docker://registry.ollama.ai/library/smollm:latest: pinging container registry registry.ollama.ai: StatusCode: 404, "404 page not found\n"

The 404 comes from https://registry.ollama.ai/v2/ and, while the other APIs work (getting manifests and blobs), podman just correctly bails at the first status check.
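A quick way to reproduce this, assuming curl is available; the model name is just an example:

```
# the /v2/ ping that podman does first: returns 404 on this registry
curl -i https://registry.ollama.ai/v2/

# the manifest endpoint, which does work
curl -H "Accept: application/vnd.docker.distribution.manifest.v2+json" \
     https://registry.ollama.ai/v2/library/smollm/manifests/latest
```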

I can see moving this forward would likely require the administrators at the ollama registry to basically just have the /v2 endpoint return 200.

I'll start mocking that call for now and see if/how podman behaves and how we can replace the custom code with podman artifact pull.

EDIT:

issues so far:

  • the registry doesn't return the correct header mime type and makes containers/image confused as to what to do (patching for now...)
  • need to add the ollama media type, for some reason (may be just because I'm patching)
  • some more media type edits

EDIT:

works but fails to update the image manifest

./bin/podman artifact pull registry.ollama.ai/library/smollm
Getting image source signatures
Copying blob ca7a9654b546 done   | 
Copying blob 62fbfd9ed093 done   | 
Copying blob 6cafb858555d done   | 
Copying blob cfc7749b96f6 skipped: already exists  
Error: creating an updated image manifest: Unknown media type during manifest conversion: "application/vnd.ollama.image.model"

EDIT:

made it work by patching the last thing:

./bin/podman artifact pull registry.ollama.ai/library/smollm        
Getting image source signatures
Copying blob 62fbfd9ed093 skipped: already exists  
Copying blob 6cafb858555d skipped: already exists  
Copying blob cfc7749b96f6 skipped: already exists  
Copying blob ca7a9654b546 skipped: already exists  
Copying config 6cb452e016 done   | 
Writing manifest to image destination

I'd say I'll wait for input now :) I'm sure once the ollama registry actually implements the registry spec, most of the edits I've made won't be needed. And the next step would be to check how to actually grab/play with whatever is in the layers (I guess, but that doesn't seem to be in scope for this specific GitHub issue)

runcom avatar Jun 06 '25 10:06 runcom

This would be cool, but we need to start handling Artifacts in RamaLama. Currently RamaLama only supports OCI Images

rhatdan avatar Jun 07 '25 16:06 rhatdan

> Alright, I've been working on this recently and while podman artifact pull can totally be used for the ollama registry, the problem is the registry itself. The registry at registry.ollama.ai doesn't fully adhere to the protocol for pulling images. Just querying the registry status URL, which is the first step in pulling, gives a 404, and that's where the code stops because it doesn't know there's a registry there at all (it's like a red herring lol).
>
>     podman artifact pull registry.ollama.ai/library/smollm
>     Error: initializing source docker://registry.ollama.ai/library/smollm:latest: pinging container registry registry.ollama.ai: StatusCode: 404, "404 page not found\n"
>
> The 404 comes from https://registry.ollama.ai/v2/ and, while the other APIs work (getting manifests and blobs), podman just correctly bails at the first status check.

What if we require the user to do this:

ollama://registry.ollama.ai/library/smollm:latest

and make the exception based on the presence of the "ollama://" prefix?

> [...] And the next step would be to check how to actually grab/play with whatever is in the layers (I guess, but that doesn't seem to be in scope for this specific GitHub issue)

The layers are actually quite simple, each layer represents a field here:

https://ollama.com/library/smollm:135m

The only field we actually use is the model one (it's just a gguf file), and we will probably keep things that way I think.
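For illustration, an Ollama manifest looks roughly like this; the digests, sizes, and the exact set of non-model layers are illustrative, and only the application/vnd.ollama.image.model layer is used:

```json
{
  "schemaVersion": 2,
  "mediaType": "application/vnd.docker.distribution.manifest.v2+json",
  "config": {
    "mediaType": "application/vnd.docker.container.image.v1+json",
    "digest": "sha256:...",
    "size": 500
  },
  "layers": [
    { "mediaType": "application/vnd.ollama.image.model",    "digest": "sha256:...", "size": 270898672 },
    { "mediaType": "application/vnd.ollama.image.template", "digest": "sha256:...", "size": 1100 },
    { "mediaType": "application/vnd.ollama.image.license",  "digest": "sha256:...", "size": 7020 }
  ]
}
```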

My vote would be to open a PR in podman with what you've done so far :)

But I'm open to ideas, our python3-based puller works fine, but podman artifact feels like a more appropriate place to keep the implementation in some ways.

ericcurtin avatar Jun 09 '25 13:06 ericcurtin

> My vote would be to open a PR in podman with what you've done so far :)

I'm torn on this... The problem with ollama and podman right now is the ollama registry misbehaving. If I open a PR in podman (read: containers/image) to patch it the way I did in my testing, it'll be a huge stopgap.

> What if we require the user to do this:
>
>     ollama://registry.ollama.ai/library/smollm:latest
>
> and make the exception based on the presence of the "ollama://" prefix?

I don't think podman would be OK with the above; podman artifacts is generic, and adding the prefix isn't a nice UI IMO.

All that said, I think we should engage with whoever takes care of the ollama registry and just nudge them into fixing their registry :) that's the simplest and right thing to do. Afterwards, ramalama can just use podman artifacts w/o any patching (minus maybe some header addition to containers/image). But it'll all start with fixing the registry to behave like a proper container registry.

runcom avatar Jun 10 '25 09:06 runcom

Good luck, the broken registry might be part of the prioritization strategy.

rhatdan avatar Jun 10 '25 09:06 rhatdan

Thought: make artifacts the base content for all storage. Whether you pull from ollama, huggingface, file, https, or OCI, just store it as artifacts. Currently every tool is building its own content store, with none of them shared. If we moved to artifact storage, and k8s could mount artifacts into containers as well as Podman, then we would have an easy storage solution.

Then removing, listing, and pushing models just becomes a wrapper around podman artifact.
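For example, a rough sketch of that workflow using the podman artifact subcommands; the artifact mount syntax is the one in recent podman (5.5+), and the image and paths are illustrative:

```
# store the model as an OCI artifact, whatever it was pulled from
podman artifact pull docker.io/ai/smollm2

# list everything in the shared artifact store
podman artifact ls

# mount the artifact into a container (illustrative image and path)
podman run --mount type=artifact,src=docker.io/ai/smollm2,dest=/models \
    quay.io/ramalama/ramalama ls /models
```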

rhatdan avatar Jun 10 '25 12:06 rhatdan

Another somewhat related thing is we proved:

podman artifact pull docker.io/ai/smollm2

works. But I don't think we've put in the effort to connect the dots yet, so that:

ramalama run docker://docker.io/ai/smollm2

or

ramalama serve docker://docker.io/ai/smollm2

works... Assuming that's the format we agree on for the docker AI format.
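A hedged sketch of one way those dots could be connected, assuming podman's artifact extract subcommand (available in recent podman); the paths are illustrative and this is not RamaLama's actual wiring:

```
# pull the model as an OCI artifact
podman artifact pull docker.io/ai/smollm2

# copy its blobs out of the artifact store into a directory
podman artifact extract docker.io/ai/smollm2 /tmp/smollm2

# then serve the extracted GGUF file (illustrative path)
ramalama serve /tmp/smollm2/<model-file>
```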

ericcurtin avatar Jun 11 '25 01:06 ericcurtin

> Good luck, the broken registry might be part of the prioritization strategy.

Echoing what was said here. There have been pull requests opened in the Ollama tool to enhance interop with other types of OCI registries that are not the Ollama registry, so one could push/pull Ollama models to an everyday OCI registry. These efforts have been ignored. It's almost like lock-in to the Ollama registry is part of the strategy. 😊

ericcurtin avatar Jun 11 '25 05:06 ericcurtin

> > Good luck, the broken registry might be part of the prioritization strategy.
>
> Echoing what was said here. There have been pull requests opened in the Ollama tool to enhance interop with other types of OCI registries that are not the Ollama registry, so one could push/pull Ollama models to an everyday OCI registry. These efforts have been ignored. It's almost like lock-in to the Ollama registry is part of the strategy. 😊

ah, makes sense, I didn't know this!

So, since their registry isn't gonna be compliant anytime soon, I don't see how to make podman artifact work without patching specifically for a non-conformant registry 🤷 I guess the only option I see is having skopeo grow the ability to skip certain registry calls, like --do-not-ping-registry-v2, and to add custom headers, --accept-ollama-model, etc.

runcom avatar Jun 11 '25 09:06 runcom

I think doing things this way in skopeo (--do-not-ping-registry-v2, --accept-ollama-model) sounds fine.

From the RamaLama perspective we would continue to do these exceptional things based on whether:

"ollama://"

is prefixed or not, like we do today; or, if a shortname is passed and it's not in the shortnames file, we default to ollama://.
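A minimal sketch of that dispatch, with hypothetical helper names (this is not RamaLama's actual code):

```python
# Hypothetical dispatch: ollama:// refs go through the special-cased puller,
# everything else through a plain "podman artifact pull".
import subprocess

def pull_from_ollama(ref: str) -> None:
    # stand-in for the existing python3 puller (or a patched skopeo call)
    ...

def pull_model(ref: str, shortnames: dict[str, str]) -> None:
    # shortnames resolve via the shortnames file; unknown shortnames
    # default to the ollama:// transport, as described above
    resolved = shortnames.get(ref, ref if "://" in ref else f"ollama://{ref}")

    if resolved.startswith("ollama://"):
        pull_from_ollama(resolved)
    else:
        subprocess.run(
            ["podman", "artifact", "pull", resolved.split("://", 1)[-1]],
            check=True,
        )
```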

ericcurtin avatar Jun 11 '25 09:06 ericcurtin

> I think doing things this way in skopeo (--do-not-ping-registry-v2, --accept-ollama-model) sounds fine.

I mean, ollama is almost container-registry compatible, so if we want to remove some code here that deals with pulling ollama models, the only thing would indeed be to patch skopeo, as I don't think podman should work with non-compliant registries.

runcom avatar Jun 11 '25 10:06 runcom

I think the skopeo use case is less important. We want Podman to be able to treat content as artifacts, as I said above.

https://github.com/containers/ramalama/issues/1112#issuecomment-2958940583

The end goal is to allow podman to keep track of all models and be able to mount the models into containers. Once we start fooling around with skopeo we lose.

rhatdan avatar Jun 11 '25 10:06 rhatdan

Still waiting for https://github.com/containers/podman/pull/26577

rhatdan avatar Jul 22 '25 13:07 rhatdan

A friendly reminder that this issue had no activity for 30 days.

github-actions[bot] avatar Aug 22 '25 00:08 github-actions[bot]

podman 5.6 now has support for podman-remote artifact, which means this can move forward.

rhatdan avatar Aug 26 '25 12:08 rhatdan

A friendly reminder that this issue had no activity for 30 days.

github-actions[bot] avatar Sep 26 '25 00:09 github-actions[bot]