ramalama
Explore replacing python3 ollama puller with "podman artifact pull"
We wrote an ollama puller from scratch in python3; it's essentially two HTTP requests, and the final request is basically a "podman artifact pull". Explore whether we can replace our pull with "podman artifact pull".
Potential benefits:
- pushing Ollama artifact to OCI registry without conversion steps
- we can take advantage of podman's authentication code and, more generally, the great compatibility podman has with many types of OCI registries
Python3 implementation:
https://github.com/ericcurtin/lm-pull/blob/f688fb83fce2c96efeadb096b6cdaea11e133d4a/lm-pull.py#L262
The current ramalama implementation is based on this, but is more complex.
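For a sense of scale, the whole flow fits in a short sketch. This is a minimal illustration, not the actual ramalama code; it assumes the OCI distribution-spec paths the ollama registry exposes and the ollama model media type that shows up in the errors further down this thread:

```python
import hashlib
import json
import urllib.request

REGISTRY = "https://registry.ollama.ai"

def pull(repo: str = "library/smollm", tag: str = "latest") -> str:
    # Request 1: fetch the image manifest for the tag.
    req = urllib.request.Request(
        f"{REGISTRY}/v2/{repo}/manifests/{tag}",
        headers={"Accept": "application/vnd.docker.distribution.manifest.v2+json"},
    )
    with urllib.request.urlopen(req) as resp:
        manifest = json.load(resp)

    # The model weights are the layer carrying ollama's model media type.
    layer = next(l for l in manifest["layers"]
                 if l["mediaType"] == "application/vnd.ollama.image.model")

    # Request 2: fetch that blob by digest, verifying it as it streams down.
    digest = layer["digest"]
    sha256 = hashlib.sha256()
    path = digest.removeprefix("sha256:") + ".gguf"
    with urllib.request.urlopen(f"{REGISTRY}/v2/{repo}/blobs/{digest}") as resp, \
            open(path, "wb") as out:
        while chunk := resp.read(1 << 20):
            sha256.update(chunk)
            out.write(chunk)
    if f"sha256:{sha256.hexdigest()}" != digest:
        raise ValueError("digest mismatch")
    return path
```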
Alright, I've been working on this recently and while podman artifact pull can totally be used for the ollama registry, the problem is the registry itself.
The registry at registry.ollama.ai doesn't fully adhere to the protocol for pulling images. Querying the registry status URL, which is the first step in pulling, just gives a 404, and that's where the code stops, because as far as it can tell there's no registry there at all (it's like a red herring lol).
podman artifact pull registry.ollama.ai/library/smollm
Error: initializing source docker://registry.ollama.ai/library/smollm:latest: pinging container registry registry.ollama.ai: StatusCode: 404, "404 page not found\n"
The 404 comes from https://registry.ollama.ai/v2/ and, while the other APIs work (getting the manifest and blobs), podman correctly bails at the first status check.
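The behaviour is easy to reproduce outside podman; a quick sanity check in Python against the same endpoints as in the error above:

```python
import urllib.error
import urllib.request

# The registry "ping" podman performs first: GET /v2/ should return 200 (or a
# 401 with an auth challenge) on a spec-compliant registry.
try:
    urllib.request.urlopen("https://registry.ollama.ai/v2/")
except urllib.error.HTTPError as e:
    print(e.code, e.reason)  # 404 Not Found -- this is where podman bails

# ...while the pull endpoints behind it answer just fine:
req = urllib.request.Request(
    "https://registry.ollama.ai/v2/library/smollm/manifests/latest",
    headers={"Accept": "application/vnd.docker.distribution.manifest.v2+json"},
)
with urllib.request.urlopen(req) as resp:
    print(resp.status)  # 200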
I can see that moving this forward would likely require the administrators of the ollama registry to basically just have the /v2 endpoint return 200.
I'll start mocking that call for now and see if/how podman behaves and how we can replace the custom code with podman artifact pull.
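One way to mock that call without touching podman is a tiny local shim that answers the /v2/ ping itself and proxies everything else upstream. This is purely a hypothetical experiment harness (single-threaded, plain HTTP, hard-coded media types), not something to ship:

```python
import http.server
import urllib.request

UPSTREAM = "https://registry.ollama.ai"

class Shim(http.server.BaseHTTPRequestHandler):
    def do_GET(self):
        # Answer the status ping ourselves, since upstream 404s on it.
        if self.path.rstrip("/") == "/v2":
            self.send_response(200)
            self.end_headers()
            return
        # Proxy everything else (manifests, blobs) to the real registry,
        # papering over the Content-Type it gets wrong along the way.
        with urllib.request.urlopen(UPSTREAM + self.path) as resp:
            body = resp.read()
        self.send_response(200)
        self.send_header(
            "Content-Type",
            "application/vnd.docker.distribution.manifest.v2+json"
            if "/manifests/" in self.path else "application/octet-stream",
        )
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

http.server.HTTPServer(("127.0.0.1", 5000), Shim).serve_forever()
```

Pointed at localhost:5000 (and assuming artifact pull honors the usual --tls-verify=false flag for plain-HTTP registries), podman gets past the ping and runs straight into the media-type problems described in the edits below.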
EDIT:
issues so far:
- the registry doesn't return the correct Content-Type header, which confuses containers/image as to what to do (patching for now... see the snippet after this list)
- need to add the ollama media type, for some reason (maybe just because I'm patching)
- some more media type edits
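To see the first bullet concretely: compare the Content-Type the registry sends for a manifest against the media type a compliant registry would advertise. A tiny check, not part of any patch:

```python
import urllib.request

resp = urllib.request.urlopen(
    "https://registry.ollama.ai/v2/library/smollm/manifests/latest")
print(resp.headers.get("Content-Type"))
# A compliant registry answers with the manifest's own media type, e.g.
# application/vnd.docker.distribution.manifest.v2+json; a generic value
# (text/plain, application/octet-stream) leaves containers/image guessing.
```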
EDIT:
works but fails to update the image manifest
./bin/podman artifact pull registry.ollama.ai/library/smollm
Getting image source signatures
Copying blob ca7a9654b546 done |
Copying blob 62fbfd9ed093 done |
Copying blob 6cafb858555d done |
Copying blob cfc7749b96f6 skipped: already exists
Error: creating an updated image manifest: Unknown media type during manifest conversion: "application/vnd.ollama.image.model"
EDIT:
made it work by patching the last thing:
./bin/podman artifact pull registry.ollama.ai/library/smollm
Getting image source signatures
Copying blob 62fbfd9ed093 skipped: already exists
Copying blob 6cafb858555d skipped: already exists
Copying blob cfc7749b96f6 skipped: already exists
Copying blob ca7a9654b546 skipped: already exists
Copying config 6cb452e016 done |
Writing manifest to image destination
I'd say I'll wait for input now :) I'm sure that once the ollama registry actually implements the registry spec, most of the edits I've made won't be needed. The next step would be to check how to actually grab and work with what's in the layers (I guess, but that doesn't seem to belong in this specific GitHub issue).
This would be cool, but we need to start handling Artifacts in RamaLama. Currently RamaLama only supports OCI Images.
> Alright, I've been working on this recently and while podman artifact pull can totally be used for the ollama registry, the problem is the registry itself. [...]
What if we require the user do this:
ollama://registry.ollama.ai/library/smollm:latest
and make the exception based on the presence of "ollama://" prefix?
> [...] I'd say I'll wait for input now :) [...] the next step would be to check how to actually grab/play with whatever is in the layers
The layers are actually quite simple; each layer represents a field here:
https://ollama.com/library/smollm:135m
The only field we actually use is the model one (it's just a GGUF file), and we'll probably keep things that way, I think.
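For illustration, here's roughly what walking those fields looks like against the manifest. The media-type suffixes (.model, .template, .params, .license) are what the ollama registry serves in practice, and the tag matches the linked page; this is a sketch, not ramalama code:

```python
import json
import urllib.request

req = urllib.request.Request(
    "https://registry.ollama.ai/v2/library/smollm/manifests/135m",
    headers={"Accept": "application/vnd.docker.distribution.manifest.v2+json"},
)
with urllib.request.urlopen(req) as resp:
    manifest = json.load(resp)

# Each layer's media type names the field it carries; the ".model" layer is
# the raw GGUF file, which is the only one ramalama actually consumes.
for layer in manifest["layers"]:
    field = layer["mediaType"].rsplit(".", 1)[-1]  # model, template, ...
    print(f'{field:10} {layer["digest"]}  {layer["size"]} bytes')
```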
My vote would be to open a PR in podman with what you've done so far :)
But I'm open to ideas, our python3-based puller works fine, but podman artifact feels like a more appropriate place to keep the implementation in some ways.
> My vote would be to open a PR in podman with what you've done so far :)
I'm torn on this... The problem with ollama and podman right now is the ollama registry misbehaving. If I open a PR in podman (read: containers/image) to patch it the way I did in my testing, it'll be a huge stopgap.
> What if we require the user do this:
> ollama://registry.ollama.ai/library/smollm:latest
> and make the exception based on the presence of "ollama://" prefix?
I don't think podman would be OK with the above; podman artifact is generic, and adding the prefix isn't a nice UI, IMO.
All that said, I think we should engage with whoever takes care of the ollama registry and just nudge them into fixing it :) That's the simplest and right thing to do. Afterwards, ramalama could use podman artifact without any patching (minus maybe some header additions to containers/image). But it all starts with the registry behaving like a proper container registry.
Good luck, the broken registry might be part of the prioritization strategy.
Thought: make artifacts the base content for all storage. Whether you pull from ollama, huggingface, file, https, or an OCI registry, just store the result as an artifact. Currently every tool is building its own content store, with none of them shared. If we moved to artifact storage, and k8s as well as Podman could mount artifacts into containers, we would have an easy storage solution.
Then removing, listing, and pushing models just becomes a wrapper around podman artifact (a sketch follows below).
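A rough sense of what that wrapper could look like. Helper names are hypothetical; podman artifact pull/ls/rm exist as subcommands in recent podman, but the --format template fields here are an assumption and would need checking against a given release:

```python
import subprocess

def _podman_artifact(*args: str) -> str:
    """Run a `podman artifact` subcommand and return its stdout."""
    return subprocess.run(
        ["podman", "artifact", *args],
        check=True, capture_output=True, text=True,
    ).stdout

def pull_model(ref: str) -> None:
    _podman_artifact("pull", ref)

def remove_model(ref: str) -> None:
    _podman_artifact("rm", ref)

def list_models() -> list[str]:
    # One artifact reference per line; the Go-template field names are an
    # assumption and may differ between podman releases.
    out = _podman_artifact("ls", "--format", "{{.Repository}}:{{.Tag}}")
    return out.splitlines()
```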
Another somewhat related thing is we proved:
podman artifact pull docker.io/ai/smollm2
works. But I don't think we've put in the effort to connect the dots yet, so that:
ramalama run docker://docker.io/ai/smollm2
or
ramalama serve docker://docker.io/ai/smollm2
work... assuming that's the format we agree on for the Docker AI format.
> Good luck, the broken registry might be part of the prioritization strategy.
Echoing what was said here. There have been pull requests opened against the Ollama tool to enhance interop with OCI registries other than the Ollama registry, so one could push/pull Ollama models to an everyday OCI registry. These efforts have been ignored. It's almost like lock-in to the Ollama registry is part of the strategy. 😊
> [...] It's almost like lock-in to the Ollama registry is part of the strategy. 😊
ah, makes sense, I didn't know this!
So, since their registry isn't gonna be compliant anytime soon, I don't see how to make podman artifact work without patching specifically for a non-conformant registry 🤷
I guess the only option I see is having skopeo grow the ability to skip certain registry calls (e.g. a --do-not-ping-registry-v2 flag) and to add custom headers (--accept-ollama-model, etc.).
I think doing things this way for skopeo --do-not-ping-registry-v2, --accept-ollama-model sounds fine.
From the RamaLama perspective, we would continue to do these exceptional things based on whether the "ollama://" prefix is present, like we do today; and if a shortname is passed that isn't in the shortnames file, we default to ollama:// (see the sketch below).
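Illustratively, that dispatch rule reads something like this (the function and the exact prefix set are a sketch, not ramalama's actual code):

```python
def resolve_model_spec(spec: str, shortnames: dict[str, str]) -> str:
    """Map a user-supplied model spec to a fully-qualified transport URL."""
    known_prefixes = ("oci://", "docker://", "huggingface://", "ollama://")
    if spec.startswith(known_prefixes):
        return spec
    # A bare shortname resolves through the shortnames file when present...
    if spec in shortnames:
        return shortnames[spec]
    # ...and anything else falls back to the ollama registry.
    return f"ollama://{spec}"
```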
> I think doing things this way for skopeo --do-not-ping-registry-v2, --accept-ollama-model sounds fine.
I mean, ollama is almost container-registry compatible, so if we want to remove some code here that deals with pulling ollama models, the only option would indeed be to patch skopeo; I don't think podman should work with non-compliant registries.
I think the skopeo use case is less important; we want Podman to be able to treat content as artifacts, as I said above:
https://github.com/containers/ramalama/issues/1112#issuecomment-2958940583
The end goal is to allow podman to keep track of all models and be able to mount the models into containers. Once we start fooling around with skopeo, we lose that.
Still waiting for https://github.com/containers/podman/pull/26577
A friendly reminder that this issue had no activity for 30 days.
podman 5.6 now has support for podman-remote artifact, which means this can move forward.
A friendly reminder that this issue had no activity for 30 days.