Eric Curtin

87 issues

Apparently Docker doesn't allow it; going with RAMALAMA instead, as the container is sort of redundant anyway.

We removed support for gfx9, which covers roughly AMD GPUs released prior to 2019, so the image can fit in CI and is easier to download. This saves us a ton of...

We should be close to Docker support, but let's get it tested regularly via GitHub CI.
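A minimal smoke test along these lines could run in CI once Docker works; the RAMALAMA_CONTAINER_ENGINE variable and the model reference are illustrative assumptions, not confirmed interfaces:

# Hypothetical CI smoke test: force the Docker backend, pull a small model, list it.
# RAMALAMA_CONTAINER_ENGINE and the model name are assumptions for illustration.
export RAMALAMA_CONTAINER_ENGINE=docker
ramalama pull ollama://tinyllama
ramalama list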

zsh is the default shell on macOS; autocomplete is not working there, as reported by @rhatdan (a possible workaround is sketched below).

good first issue
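For the zsh issue above, one possible workaround sketch, assuming completions are wired through Python's argcomplete (an assumption for illustration, not necessarily how ramalama ships completions):

# Sketch: register bash-style completions in zsh via argcomplete (assumed setup).
# bashcompinit lets zsh consume bash completion functions.
autoload -U bashcompinit && bashcompinit
eval "$(register-python-argcomplete ramalama)"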

If there is a way to auto-detect whether a file is a language model or an ASR model, we should do that; if that's not possible, we should just use a runtime... (one possible detection approach is sketched after the labels)

enhancement
good first issue
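For the auto-detect idea above, a sketch assuming GGUF model files and the gguf-dump tool from the gguf Python package; that ASR models report a whisper architecture is also an assumption:

# Sketch: guess model type from GGUF metadata.
# Assumes `pip install gguf` provides gguf-dump, and that ASR models
# report a whisper architecture in general.architecture.
if gguf-dump "$MODEL" | grep -q 'general.architecture.*whisper'; then
    echo "ASR model"
else
    echo "language model"
fi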

Use UBI9 if possible.

latest-cuda will go something like this:

dnf config-manager --add-repo https://developer.download.nvidia.com/compute/cuda/repos/rhel9/x86_64/cuda-rhel9.repo
dnf install -y libcudnn8 nvidia-driver-NVML nvidia-driver-cuda-libs

Build with GGML_CUDA=1.

latest-rocm will go something like this: https://rocm.docs.amd.com/projects/install-on-linux/en/latest/install/native-install/rhel.html...
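Put together as a Containerfile, latest-cuda could look roughly like this; the base image tag, the extra build packages, and the llama.cpp build step are assumptions (the CUDA toolkit may also be needed for nvcc):

# Sketch of a latest-cuda Containerfile; package set and build flags are assumptions.
FROM registry.access.redhat.com/ubi9/ubi:latest
RUN dnf install -y git cmake gcc-c++ && \
    dnf config-manager --add-repo https://developer.download.nvidia.com/compute/cuda/repos/rhel9/x86_64/cuda-rhel9.repo && \
    dnf install -y libcudnn8 nvidia-driver-NVML nvidia-driver-cuda-libs && \
    dnf clean all
# Build llama.cpp with CUDA enabled, per the GGML_CUDA=1 note above.
RUN git clone https://github.com/ggerganov/llama.cpp && \
    cmake -S llama.cpp -B llama.cpp/build -DGGML_CUDA=1 && \
    cmake --build llama.cpp/build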

Discussed here: https://github.com/containers/ramalama/issues/239 https://github.com/abetlen/llama-cpp-python/blob/7c4aead82d349469bbbe7d8c0f4678825873c039/docs/server.md#configuration-and-multi-model-support
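Per the linked llama-cpp-python server docs, multi-model support is driven by a config file; a minimal sketch (host, port, model paths, and aliases here are made-up examples):

# Sketch: serve multiple models from one llama-cpp-python server process.
# Paths and aliases are examples only.
cat > config.json <<'EOF'
{
  "host": "0.0.0.0",
  "port": 8080,
  "models": [
    { "model": "models/llama-2-7b.Q4_K_M.gguf", "model_alias": "llama2" },
    { "model": "models/mistral-7b.Q4_K_M.gguf", "model_alias": "mistral" }
  ]
}
EOF
python3 -m llama_cpp.server --config_file config.json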

Right now we call llama.cpp directly; long-term we should go with either llama.cpp directly or llama-cpp-python, because maintaining two different llama.cpp backends isn't ideal: they will never be in sync...

We should consolidate our efforts with instructlab and share container base images: https://github.com/instructlab/instructlab/tree/main/containers

quay.io can only automatically build x86_64. aarch64 is important; it is tested to work on Apple Silicon/macOS with podman machine and libkrun. We just need to figure out a way...
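One way around that limitation would be to build the multi-arch manifest ourselves and push it; the image name below is an example, and cross-building aarch64 on an x86_64 host assumes qemu-user-static is installed:

# Sketch: build and push an x86_64 + aarch64 manifest with podman.
# Image name is an example; cross-arch builds need qemu-user-static.
podman build --platform linux/amd64,linux/arm64 --manifest quay.io/ramalama/ramalama:latest .
podman manifest push quay.io/ramalama/ramalama:latest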