Pull models from OCI registries with a specific UserAgent
A reference implementation:
https://github.com/ggml-org/llama.cpp/pull/15790/files
Please implement this with a LocalAI-specific User-Agent string so we can see stats on which client-side tools are being used.
The llama-server User-Agent string is "llama.cpp".
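A minimal sketch of what this could look like, assuming the pull path uses go-containerregistry (the reference and the "LocalAI/2.x" version string below are placeholders, not the actual LocalAI code):

```go
package main

import (
	"fmt"
	"log"

	"github.com/google/go-containerregistry/pkg/name"
	v1 "github.com/google/go-containerregistry/pkg/v1"
	"github.com/google/go-containerregistry/pkg/v1/remote"
)

// pullWithUserAgent fetches an image and identifies the client to the
// registry via the User-Agent header on every request.
func pullWithUserAgent(ref string) (v1.Image, error) {
	r, err := name.ParseReference(ref)
	if err != nil {
		return nil, err
	}
	// "LocalAI/2.x" is a placeholder; the exact string is up to the maintainers.
	return remote.Image(r, remote.WithUserAgent("LocalAI/2.x"))
}

func main() {
	img, err := pullWithUserAgent("ghcr.io/example/some-model:latest")
	if err != nil {
		log.Fatal(err)
	}
	digest, err := img.Digest()
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println("pulled", digest)
}
```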
Hi @ericcurtin 👋 we already support pulling images from Docker Hub in https://github.com/mudler/LocalAI/blob/2b9a3d32c9402ef86f56a71562147c250db04f3f/pkg/oci/image.go#L210 - however we don't have a specific User-Agent.
I see, it looks like the structure of the OCI artifact is different, we should try and consolidate on that
I would love to! We also support pulling from Ollama, which makes it a bit more cumbersome to maintain: https://github.com/mudler/LocalAI/blob/2b9a3d32c9402ef86f56a71562147c250db04f3f/pkg/oci/ollama.go#L71 (we select the model layer with `if layer.MediaType == "application/vnd.ollama.image.model" {`).
I wish we could settle on a universal way rather than having different implementations. On our side, we have been (ab)using the OCI concept and building "from scratch" images with only the model files inside, but I'm happy to revisit this!
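For illustration, a short sketch of that media-type-based layer selection with go-containerregistry; the reference below is hypothetical and auth/registry details are out of scope:

```go
package main

import (
	"fmt"
	"log"

	"github.com/google/go-containerregistry/pkg/name"
	"github.com/google/go-containerregistry/pkg/v1/remote"
)

// Ollama marks the layer holding the model weights with this media type.
const ollamaModelMediaType = "application/vnd.ollama.image.model"

func main() {
	// Hypothetical reference, for illustration only.
	ref, err := name.ParseReference("registry.ollama.ai/library/llama3:latest")
	if err != nil {
		log.Fatal(err)
	}
	img, err := remote.Image(ref)
	if err != nil {
		log.Fatal(err)
	}
	layers, err := img.Layers()
	if err != nil {
		log.Fatal(err)
	}
	// Walk the layers and pick out the one carrying the weights.
	for _, l := range layers {
		mt, err := l.MediaType()
		if err != nil {
			continue
		}
		if string(mt) == ollamaModelMediaType {
			d, _ := l.Digest()
			fmt.Println("model layer digest:", d)
		}
	}
}
```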
Agree, this is what docker does:
https://www.docker.com/blog/oci-artifacts-for-ai-model-packaging/
Interesting, it would be cool if there were a more open approach to this. Besides GGUF files, there are many other popular formats, for instance .safetensors, .onnx, or full checkpoint files. This is why, in the first place, I went for a flattened "from scratch" directory rather than trying to implement a standard myself. It would be interesting to understand whether future iterations of the format would take other model types into consideration.
Yeah I agree, I am especially interested in what we do for .safetensors, as it's multi-file and I haven't figured out all the files that may be required to run on something like vLLM or MLX. Do we just .tar.gz the whole HF repo into an application/vnd.docker.image.rootfs.diff.tar.gzip layer? Is compression even worth it?
@ekcasey works on this on the docker side and might have ideas.
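For illustration, a rough sketch of the ".tar.gz the whole HF repo" idea using go-containerregistry; the directory layout, file names, and use of the default Docker tar.gz layer media type here are assumptions, not an agreed-upon format:

```go
package main

import (
	"archive/tar"
	"compress/gzip"
	"io"
	"log"
	"os"
	"path/filepath"

	"github.com/google/go-containerregistry/pkg/v1/empty"
	"github.com/google/go-containerregistry/pkg/v1/mutate"
	"github.com/google/go-containerregistry/pkg/v1/tarball"
)

// tarGzDir packs every regular file under dir into a gzipped tarball at out.
func tarGzDir(dir, out string) error {
	f, err := os.Create(out)
	if err != nil {
		return err
	}
	defer f.Close()
	gz := gzip.NewWriter(f)
	defer gz.Close()
	tw := tar.NewWriter(gz)
	defer tw.Close()
	return filepath.Walk(dir, func(path string, info os.FileInfo, err error) error {
		if err != nil || !info.Mode().IsRegular() {
			return err
		}
		rel, err := filepath.Rel(dir, path)
		if err != nil {
			return err
		}
		hdr, err := tar.FileInfoHeader(info, "")
		if err != nil {
			return err
		}
		hdr.Name = rel
		if err := tw.WriteHeader(hdr); err != nil {
			return err
		}
		src, err := os.Open(path)
		if err != nil {
			return err
		}
		defer src.Close()
		_, err = io.Copy(tw, src)
		return err
	})
}

func main() {
	// Hypothetical path: a downloaded HF repo with *.safetensors, config.json,
	// tokenizer files, etc., packed as a single layer on an otherwise empty image.
	if err := tarGzDir("./my-model-hf-repo", "model-layer.tar.gz"); err != nil {
		log.Fatal(err)
	}
	layer, err := tarball.LayerFromFile("model-layer.tar.gz")
	if err != nil {
		log.Fatal(err)
	}
	img, err := mutate.AppendLayers(empty.Image, layer)
	if err != nil {
		log.Fatal(err)
	}
	digest, _ := img.Digest()
	log.Println("built image", digest)
}
```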