ddh0

Results: 41 comments by ddh0

Meta has updated their repos to specify that both stop tokens should be used as EOS: https://github.com/ggerganov/llama.cpp/pull/6745#issuecomment-2066827777

But you shouldn't have to hack the metadata to get the model to work as intended... And in any case I think this could be a useful feature in general,...

> I'll implement 1. as well to add support for multiple stop token ids if anyone can link a gguf file with that metadata. If I understand correctly the llama.cpp...

A bit of digging turned up [this comment](https://github.com/ROCm/HIP/issues/955#issuecomment-471863747), which says you'll need to set the `CMAKE_MODULE_PATH` variable. See [here](https://github.com/ROCm-Developer-Tools/HIP/tree/master/samples/2_Cookbook/12_cmake_hip_add_executable#including-findhip-cmake-module-in-the-project) for more details. If this doesn't work, could you...
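For reference, a minimal `CMakeLists.txt` fragment following the linked cookbook sample, assuming a default ROCm install under `/opt/rocm` (adjust the path for your system):

```cmake
# Point CMAKE_MODULE_PATH at the directory containing FindHIP.cmake
# before calling find_package(HIP).
list(APPEND CMAKE_MODULE_PATH "/opt/rocm/hip/cmake")
find_package(HIP REQUIRED)

# hip_add_executable comes from the FindHIP module located above.
hip_add_executable(my_app main.cpp)
```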

llama.cpp is not designed to run Stable Diffusion GGUF files. You'll need to use something like [city96/ComfyUI-GGUF](https://github.com/city96/ComfyUI-GGUF/tree/main).

You can also use this to offload the entire KV cache to GPU while keeping the model on CPU: `-ngl 999 -ot "^.*$=CPU"`
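Put together, a full invocation might look like this (a sketch; the binary name, model path, and prompt are placeholders). `-ngl 999` offloads all layers, then the `-ot` regex overrides every weight tensor back to the CPU buffer, so the KV cache and compute buffers stay on the GPU:

```shell
./llama-cli -m model.gguf -ngl 999 -ot "^.*$=CPU" -p "Hello"
```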

Looks like `llama_model_get_tensor` was removed from the API, but that change was not documented here.

> I didn't expect that this function is being used by anyone, so I skipped updating the changelog. It's updated now.
>
> Btw, what do you use this call...

PR #11397 added `llama_model_tensor_buft_override` to `llama_model_params`, causing a segmentation fault when trying to load a llama model, due to this line in `src/llama-model.cpp`:

```cpp
pimpl->has_tensor_overrides = params.tensor_buft_overrides && params.tensor_buft_overrides[0].pattern;
```

...

PR #16382 updated the `libllama` interface to add `LLAMA_API bool llama_model_is_hybrid(...)`.