
Pull models from OCI registries with a specific User-Agent

Open ericcurtin opened this issue 4 months ago • 6 comments

A reference implementation:

https://github.com/ggml-org/llama.cpp/pull/15790/files

Please implement this with a LocalAI-specific User-Agent string, so we can see stats on which client-side tools are in use.

The llama-server User-Agent string is "llama.cpp".
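For context, stamping a per-client User-Agent is a small change at the HTTP layer. A minimal Go sketch of the idea, assuming a custom `http.RoundTripper`; the value `"LocalAI"` and the `fetchSeenUA` helper are illustrative, not the project's actual implementation:

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
)

// uaTransport wraps a RoundTripper and stamps every outgoing request
// with a client-identifying User-Agent, the way llama-server sends
// "llama.cpp". The string is configurable; "LocalAI" below is a
// placeholder, not a value the project has settled on.
type uaTransport struct {
	base http.RoundTripper
	ua   string
}

func (t *uaTransport) RoundTrip(req *http.Request) (*http.Response, error) {
	clone := req.Clone(req.Context())
	clone.Header.Set("User-Agent", t.ua)
	return t.base.RoundTrip(clone)
}

// fetchSeenUA spins up a stand-in "registry" that echoes back the
// User-Agent header it received, then performs one request through
// the wrapping transport.
func fetchSeenUA(ua string) string {
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprint(w, r.UserAgent())
	}))
	defer srv.Close()

	client := &http.Client{Transport: &uaTransport{base: http.DefaultTransport, ua: ua}}
	resp, err := client.Get(srv.URL)
	if err != nil {
		panic(err)
	}
	defer resp.Body.Close()
	body, _ := io.ReadAll(resp.Body)
	return string(body)
}

func main() {
	fmt.Println(fetchSeenUA("LocalAI")) // prints the UA the stand-in registry observed
}
```

Since LocalAI's OCI code is built on go-containerregistry, the `remote.WithUserAgent` option that library provides may be a more direct route than a custom transport.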

ericcurtin avatar Sep 12 '25 11:09 ericcurtin

Hi @ericcurtin 👋 we already support pulling images from Docker Hub in https://github.com/mudler/LocalAI/blob/2b9a3d32c9402ef86f56a71562147c250db04f3f/pkg/oci/image.go#L210 - however, we don't set a specific User-Agent yet.

mudler avatar Sep 12 '25 12:09 mudler

I see, it looks like the structure of the OCI artifact is different; we should try to consolidate on that.

ericcurtin avatar Sep 12 '25 12:09 ericcurtin

> I see, it looks like the structure of the OCI artifact is different; we should try to consolidate on that.

I would love to! We also support pulling from Ollama, which makes it a bit more cumbersome to maintain: https://github.com/mudler/LocalAI/blob/2b9a3d32c9402ef86f56a71562147c250db04f3f/pkg/oci/ollama.go#L71

I wish we could settle on a universal approach rather than having different implementations. On our side, we have been (ab)using the OCI concept, with "from scratch" images that contain only the model files; however, I'm happy to revisit this!

mudler avatar Sep 12 '25 12:09 mudler

> I see, it looks like the structure of the OCI artifact is different; we should try to consolidate on that.
>
> I would love to! We also support pulling from Ollama, which makes it a bit more cumbersome to maintain: `if layer.MediaType == "application/vnd.ollama.image.model" {` (LocalAI/pkg/oci/ollama.go, line 71 at 2b9a3d3)
>
> I wish we could settle on a universal approach rather than having different implementations. On our side, we have been (ab)using the OCI concept, with "from scratch" images that contain only the model files; however, I'm happy to revisit this!

Agreed, this is what Docker does:

https://www.docker.com/blog/oci-artifacts-for-ai-model-packaging/
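For reference, OCI 1.1-style artifact manifests already allow a registry to carry arbitrary model files as layers. A hypothetical manifest for a single-file GGUF model could look like the sketch below; the `application/vnd.example.*` media types are placeholders (not Docker's, Ollama's, or LocalAI's actual types), while the `config` entry is the standard OCI empty descriptor used for non-image artifacts:

```json
{
  "schemaVersion": 2,
  "mediaType": "application/vnd.oci.image.manifest.v1+json",
  "artifactType": "application/vnd.example.model",
  "config": {
    "mediaType": "application/vnd.oci.empty.v1+json",
    "digest": "sha256:44136fa355b3678a1146ad16f7e8649e94fb4fc21fe77e8310c060f61caaff8a",
    "size": 2
  },
  "layers": [
    {
      "mediaType": "application/vnd.example.model.gguf",
      "digest": "sha256:…",
      "size": 4294967296,
      "annotations": { "org.opencontainers.image.title": "model.gguf" }
    }
  ]
}
```

Consolidating would largely come down to the clients agreeing on the `artifactType` and layer media types.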

ericcurtin avatar Sep 12 '25 12:09 ericcurtin

> Agreed, this is what Docker does:
>
> https://www.docker.com/blog/oci-artifacts-for-ai-model-packaging/

Interesting; it would be cool if there were a more open approach to this. Besides GGUF files, there are many other popular formats too, for instance .safetensors, .onnx, or full checkpoint files. That's why, in the first place, I went with a flattened from-scratch directory rather than trying to invent a standard myself. It would be interesting to understand whether future iterations of the format will take other model types into consideration.

mudler avatar Sep 12 '25 12:09 mudler

Yeah, I agree. I'm especially interested in what we do for .safetensors, since it's multi-file and I haven't figured out all the files that may be required to run on something like vLLM or mlx. Do we just .tar.gz the whole HF repo into an application/vnd.docker.image.rootfs.diff.tar.gzip layer? Is compression even worth it?

@ekcasey works on this on the docker side and might have ideas.

ericcurtin avatar Sep 12 '25 13:09 ericcurtin