Eric Curtin

Results: 479 comments of Eric Curtin

But only CPU 😢 Linux aarch64 CPU inference works too.

Off the top of my head, no; try reaching out to the skopeo, buildah, or podman artifact people.

Could you try building within a podman-machine Linux VM? It appears as though this is an attempt to build for Linux on macOS.

We will enable pushing and pulling models to OCI registries in RamaLama once the new "podman artifact" command is complete: https://github.com/containers/ramalama
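
A rough sketch of what that workflow could look like once the feature lands; the command names, argument order, and the oci:// prefix here are assumptions, not the final CLI:

```python
# Sketch only: wrapping hypothetical RamaLama push/pull commands for models
# stored in an OCI registry. Exact commands and transport syntax may differ.
import subprocess

def push_model(model: str, target: str) -> None:
    # e.g. push_model("smollm:135m", "oci://quay.io/example/smollm:135m")
    subprocess.run(["ramalama", "push", model, target], check=True)

def pull_model(source: str) -> None:
    # e.g. pull_model("oci://quay.io/example/smollm:135m")
    subprocess.run(["ramalama", "pull", source], check=True)
```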

@qcu266 @NPhMKgbDNy1M @chlins @alex-vg we have this in Docker Model Runner now and we've put effort into cleaning up the GitHub repo to make it more contributor-friendly, please star,...

We have just the llama.cpp server and vllm server integrated, but are open to the llama-cpp-python server also.

This is how it should look when we have multiple models:

```json
{
  "object": "list",
  "data": [
    {"id": "smollm:360m", "object": "model", "created": 1737073177, "owned_by": "library"},
    {"id": "smollm:135m", "object": "model", "created": 1736344249, "owned_by": "library"}
  ]
}
```
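
A minimal sketch of checking that response, assuming an OpenAI-compatible server (llama-server or vllm) is listening locally; the base URL is an assumption:

```python
# List the models an OpenAI-compatible /v1/models endpoint reports.
import json
import urllib.request

BASE_URL = "http://localhost:8080"  # assumed server address, adjust as needed

with urllib.request.urlopen(f"{BASE_URL}/v1/models") as resp:
    payload = json.load(resp)

# With multiple models loaded, "data" should contain one entry per model,
# e.g. smollm:360m and smollm:135m as in the response above.
for model in payload["data"]:
    print(model["id"], model["owned_by"])
```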

@sallyom I think we should consider this: https://github.com/BerriAI/litellm. It's compatible with vllm and llama.cpp and sits in front of either (whereas llama-cpp-python only gives us a llama.cpp solution). vllm closed...

https://docs.litellm.ai/docs/providers/vllm
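
A minimal sketch of fronting an already-running vllm (or llama.cpp) OpenAI-compatible server through litellm; the model name, the hosted_vllm/ provider prefix, and the server address are assumptions based on the linked docs:

```python
# Route one request through litellm to an OpenAI-compatible backend.
import litellm

response = litellm.completion(
    model="hosted_vllm/smollm:135m",      # assumed model identifier and prefix
    api_base="http://localhost:8000/v1",  # assumed vllm/llama-server URL
    messages=[{"role": "user", "content": "Say hello."}],
)
print(response.choices[0].message.content)
```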

After reading a bit, I'm going to propose we write our own proxy; litellm won't provide what we need. You can proxy/route/bridge (whichever terminology you prefer) requests to pre-spun-up llama-servers...
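
A minimal sketch of that routing idea, assuming each llama-server is already running on its own port and requests carry an OpenAI-style "model" field; the model names, ports, and listen address are illustrative assumptions:

```python
# Inspect the "model" field of an incoming request and forward it to the
# matching pre-spun-up llama-server backend.
import json
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

# Map of model name -> backend llama-server base URL (assumed ports).
BACKENDS = {
    "smollm:135m": "http://localhost:8081",
    "smollm:360m": "http://localhost:8082",
}

class ProxyHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        body = self.rfile.read(int(self.headers.get("Content-Length", 0)))
        model = json.loads(body).get("model", "")
        backend = BACKENDS.get(model)
        if backend is None:
            self.send_error(404, f"unknown model: {model}")
            return
        # Forward the request unchanged to the selected llama-server.
        req = urllib.request.Request(
            backend + self.path, data=body,
            headers={"Content-Type": "application/json"}, method="POST",
        )
        with urllib.request.urlopen(req) as resp:
            status = resp.status
            payload = resp.read()
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(payload)))
        self.end_headers()
        self.wfile.write(payload)

if __name__ == "__main__":
    HTTPServer(("localhost", 8080), ProxyHandler).serve_forever()
```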