Georgi Gerganov
target #9707 Adapt the Metal backend to the new registry and device interfaces. - [x] I have read the [contributing guidelines](https://github.com/ggerganov/llama.cpp/blob/master/CONTRIBUTING.md) - Self-reported review complexity: - [ ] Low -...
I was just thinking about this idea, so I am writing it down for future research. We should be able to fairly easily generate model-specific Metal code that has hardcoded kernels for...
Following up on #2421, I think we should implement a better way to observe at which point during inference the results start to deviate significantly between the classical and...
# Overview This is a list of changes to the public interface of the `llama` library. Collaborators are encouraged to edit this post in order to reflect important changes to...
## Overview This PR is an intermediate step towards a more generic implementation that will support different underlying implementations of `llama_kv_cache`, `llama_context` and the graph building logic (a.k.a. `llm_build_context`). The...
According to this discussion comment, https://github.com/ggerganov/llama.cpp/discussions/336#discussioncomment-11184134, there is a new CoreML API, and it might be possible to implement an ANE backend with the latest Apple software/hardware.