Georgi Gerganov

Results 163 issues of Georgi Gerganov

target: #8526 Straightforward Metal implementation of `SSM_CONV` and `SSM_SCAN` using single-threaded kernels, mimicking the CPU implementation. Lots of room for further optimizations; for now, ensuring correctness ```bash ./llama-batched \ -m...

testing
Review Complexity : High
ggml

https://github.com/ggerganov/llama.cpp/issues/7805#issuecomment-2227826702 Fallback to `Q4_0` instead of `IQ4_NL` due to lack of implementation of the latter in some backends - [x] I have read the [contributing guidelines](https://github.com/ggerganov/llama.cpp/blob/master/CONTRIBUTING.md) - Self-reported review...

examples
Review Complexity : Low

fix #8656 Fix system info display when using `llamafile` - [x] I have read the [contributing guidelines](https://github.com/ggerganov/llama.cpp/blob/master/CONTRIBUTING.md) - Self-reported review complexity: - [ ] Low - [ ] Medium -...

ggml

It seems like DirectML supports the upcoming NPU-enabled chips for Windows machines: https://devblogs.microsoft.com/directx/introducing-neural-processor-unit-npu-support-in-directml-developer-preview/ I don't think there is any other way to tap into this hardware, so we should explore...

help wanted
research 🔬
roadmap

Mamba-2 is a new version of the Mamba architecture: - Blog: https://tridao.me/blog/2024/mamba2-part1-model/ - Paper: https://arxiv.org/abs/2405.21060

model
research 🔬

fix #8677 - [x] I have read the [contributing guidelines](https://github.com/ggerganov/llama.cpp/blob/master/CONTRIBUTING.md) - Self-reported review complexity: - [ ] Low - [ ] Medium - [ ] High

ref https://github.com/ggerganov/llama.cpp/pull/7599#discussion_r1712901202 - [x] I have read the [contributing guidelines](https://github.com/ggerganov/llama.cpp/blob/master/CONTRIBUTING.md) - Self-reported review complexity: - [ ] Low - [ ] Medium - [ ] High

examples

Currently the [ggml](https://github.com/ggerganov/ggml), [llama.cpp](https://github.com/ggerganov/llama.cpp) and [whisper.cpp](https://github.com/ggerganov/whisper.cpp) projects share the same source of the `ggml` library, but have different CMake scripts. The scripts are adapted to the specifics of the projects...

enhancement
build
refactoring

I think it would be useful to be able to control the font size of the text in the main window. Main use case for me is to reduce the...

The `talk` and `talk.wasm` examples have become a bit stale, using a very old implementation of `gpt-2`: - https://github.com/ggerganov/whisper.cpp/tree/master/examples/talk - https://github.com/ggerganov/whisper.cpp/tree/master/examples/talk.wasm It would be nice to bring those examples up-to-date...

good first issue