Diego Devesa
It's not likely to be an incompatibility with the GPU architecture; in fact, the ggml-ci tests every commit on master on a PCIe V100. Whatever the issue is, it seems...
Could you add some documentation about how to use the `CMakePresets.json` file? A comment in the PR description is enough. If I understand correctly, this is not being used in...
I am not sure that we need to make changes to accommodate what seems to be a buggy or misconfigured VS Code extension. FWIW, I use VS Code, but not...
It seems that the Kompute backend is missing f32 get_rows. It already supports f16, so hopefully it will be easy to add f32.
The changes to the llama.h public structs are effectively an API-breaking change for no real benefit. The other structs are less sensitive since they are internal to ggml, but...
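As a generic illustration (these are hypothetical structs, not the actual ones from `llama.h` or this PR), inserting or reordering a field in a struct that is part of the public C API shifts the offsets of every field after it, so applications compiled against the old header still build but read the wrong memory at runtime until they are rebuilt:

```c
#include <stdint.h>

// Hypothetical public structs: inserting a field in the middle changes the
// offsets and size of the struct, which is an ABI break for any application
// compiled against the previous header.
struct example_params_v1 {
    int32_t n_ctx;      // offset 0
    float   rope_freq;  // offset 4
};

struct example_params_v2 {
    int32_t n_ctx;      // offset 0
    int32_t n_batch;    // new field inserted here
    float   rope_freq;  // now at offset 8 instead of 4
};
```

Appending new fields at the end of such structs is the usual way to extend them without breaking existing users.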
I am not sure that this change is necessary, or the color code stuff.
I don't think that people working enough on the llama.cpp code to benefit from ccache need to be reminded to run `make clean` before disabling it, and the link to...
There are more leaks in this function. #6289 has the fixes.
We do not want to apply optimizations that serve no real purpose but make the code harder to read. And anyway, the compiler is perfectly capable of optimizing a `strlen(s)...
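The rest of that sentence is truncated above; as a hypothetical illustration, assuming the change replaced a readable `strlen(s) == 0` check with a manual first-byte test, GCC and Clang treat `strlen` as a builtin and, with optimizations enabled, compile both forms to the same single-byte load and compare:

```c
#include <stdbool.h>
#include <string.h>

// Hypothetical example of the kind of rewrite being discussed; with
// optimizations enabled, both functions compile to the same code, so the
// manual version only costs readability.
static bool is_empty_readable(const char * s) {
    return strlen(s) == 0;
}

static bool is_empty_manual(const char * s) {
    return s[0] == '\0';
}
```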
You would need to modify `ggml_backend_registry_init` to register the backend, then it should be automatically used by `test-backend-ops`. https://github.com/ggerganov/llama.cpp/blob/54770413c484660d021dd51b5dbacab7880b8827/ggml-backend.c#L411
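As a rough sketch of what that could look like (the backend name, init function, and buffer type below are placeholders, and the exact `ggml_backend_register` signature should be checked against the linked revision of `ggml-backend.c`):

```c
// Sketch only: the "mybackend" identifiers are placeholders for the new backend.
// New backends are registered from ggml_backend_registry_init, following the
// same pattern used for the existing backends.
GGML_CALL static void ggml_backend_registry_init(void) {
    ggml_backend_registry_initialized = true;

    ggml_backend_register("CPU", ggml_backend_reg_cpu_init, ggml_backend_cpu_buffer_type(), NULL);

#ifdef GGML_USE_MYBACKEND
    // forward declarations to avoid including the backend header here
    extern GGML_CALL ggml_backend_t ggml_backend_reg_mybackend_init(const char * params, void * user_data);
    extern GGML_CALL ggml_backend_buffer_type_t ggml_backend_mybackend_buffer_type(void);

    ggml_backend_register("mybackend", ggml_backend_reg_mybackend_init,
                          ggml_backend_mybackend_buffer_type(), NULL);
#endif
}
```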