Eric Curtin
Hmmm weird, I hope we can get to the bottom of this
Should we flip --keep-groups to be on by default?
There's a fairly portable QUIC implementation now that the likes of curl use: https://github.com/ngtcp2/ngtcp2
I think the code from https://github.com/ggml-org/llama.cpp/pull/17554 is the most usable CLI for "chat". It works well, and you get all the functionality of the llama-server monolith binary. We should replace:...
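For context on how thin such a chat front-end can stay: below is a minimal C++ sketch that drives a running llama-server through its OpenAI-compatible `/v1/chat/completions` endpoint using libcurl. The host/port (`localhost:8080`, the server default) and the one-shot prompt are assumptions for illustration; this is not the code from the PR.

```cpp
// Minimal sketch (illustrative, not the PR's code): a thin "chat" client
// that posts one user message to a running llama-server instance.
#include <curl/curl.h>
#include <iostream>
#include <string>

// libcurl write callback: accumulate the response body into a std::string.
static size_t on_body(char * data, size_t size, size_t nmemb, void * userp) {
    static_cast<std::string *>(userp)->append(data, size * nmemb);
    return size * nmemb;
}

int main() {
    curl_global_init(CURL_GLOBAL_DEFAULT);
    CURL * curl = curl_easy_init();
    if (!curl) {
        return 1;
    }

    // No "model" field: a single-model llama-server answers with its loaded model (assumption).
    const std::string payload = R"({"messages":[{"role":"user","content":"Hello!"}]})";

    std::string response;
    curl_slist * headers = curl_slist_append(nullptr, "Content-Type: application/json");

    // Assumes llama-server is listening on its default host/port.
    curl_easy_setopt(curl, CURLOPT_URL, "http://localhost:8080/v1/chat/completions");
    curl_easy_setopt(curl, CURLOPT_HTTPHEADER, headers);
    curl_easy_setopt(curl, CURLOPT_POSTFIELDS, payload.c_str());
    curl_easy_setopt(curl, CURLOPT_WRITEFUNCTION, on_body);
    curl_easy_setopt(curl, CURLOPT_WRITEDATA, &response);

    const CURLcode rc = curl_easy_perform(curl);
    if (rc == CURLE_OK) {
        std::cout << response << std::endl;  // raw JSON; a real CLI would parse choices[0].message
    } else {
        std::cerr << "request failed: " << curl_easy_strerror(rc) << std::endl;
    }

    curl_slist_free_all(headers);
    curl_easy_cleanup(curl);
    curl_global_cleanup();
    return rc == CURLE_OK ? 0 : 1;
}
```

This builds with a plain `g++ chat.cpp -lcurl`; the point is only that a dedicated chat CLI can stay small by reusing the server's existing functionality.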
> > and I would like to push forward with this PR regardless of what way this goes medium-term.
>
> If I understand correctly, by saying this, you mean...
> > I don't believe the barrier to entry is that high.
>
> Oh then you don't know how many users have given up on using llama.cpp just because...
> One more thought came to mind: how flexible are we on keeping the CLI C++? I think looking at Python or JavaScript might be worthwhile. For example, leverage the...
Not against llama.cpp and Docker Model Runner teaming up also...
> Just want to relay Georgi's comment [#16603 (comment)](https://github.com/ggml-org/llama.cpp/pull/16603#issuecomment-3588688625) here:
>
> > I am OK with reorganizing the `llama-cli` tool and related if you have specific ideas - feel...
Thanks for highlighting this, @kiview. Yes, we are actively working on this; it is not ready yet, but please feel free to collaborate with us on it: https://github.com/docker/inference-engine-vllm https://github.com/vllm-project/vllm/pull/26160