Georgi Gerganov issues

Results: 163 issues of Georgi Gerganov

ggml : add SSM Metal kernels

target: #8526 Straightforward Metal implementation of `SSM_CONV` and `SSM_SCAN` using single-threaded kernels, mimicking the CPU implementation. Lots of room for further optimizations; for now, assuring correctness ```bash ./llama-batched \ -m...

testing

Review Complexity : High

ggml

llama : change fallback type IQ4_NL -> Q4_0

https://github.com/ggerganov/llama.cpp/issues/7805#issuecomment-2227826702 Fallback to `Q4_0` instead of `IQ4_NL` due to lack of implementation of the latter in some backends - [x] I have read the [contributing guidelines](https://github.com/ggerganov/llama.cpp/blob/master/CONTRIBUTING.md) - Self-reported review...

examples

Review Complexity : Low

ggml : add and use ggml_cpu_has_llamafile()

fix #8656 Fix system info display when using `llamafile` - [x] I have read the [contributing guidelines](https://github.com/ggerganov/llama.cpp/blob/master/CONTRIBUTING.md) - Self-reported review complexity: - [ ] Low - [ ] Medium -...

ggml

ggml : add DirectML backend

It seems like DirectML supports the upcoming NPU-enabled chips for Windows machines: https://devblogs.microsoft.com/directx/introducing-neural-processor-unit-npu-support-in-directml-developer-preview/ I don't think there is any other way to tap into this hardware, so we should explore...

help wanted

research 🔬

roadmap

llama : support Mamba-2

Mamba-2 is a new version of the Mamba architecture: - Blog: https://tridao.me/blog/2024/mamba2-part1-model/ - Paper: https://arxiv.org/abs/2405.21060

model

research 🔬

llama : fix build + fix fabs compile warnings

fix #8677 - [x] I have read the [contributing guidelines](https://github.com/ggerganov/llama.cpp/blob/master/CONTRIBUTING.md) - Self-reported review complexity: - [ ] Low - [ ] Medium - [ ] High

py : fix requirements check '==' -> '~='

ref https://github.com/ggerganov/llama.cpp/pull/7599#discussion_r1712901202 - [x] I have read the [contributing guidelines](https://github.com/ggerganov/llama.cpp/blob/master/CONTRIBUTING.md) - Self-reported review complexity: - [ ] Low - [ ] Medium - [ ] High

examples

ggml : unified CMake build

Currently the [ggml](https://github.com/ggerganov/ggml), [llama.cpp](https://github.com/ggerganov/llama.cpp) and [whisper.cpp](https://github.com/ggerganov/whisper.cpp) projects share the same source of the `ggml` library, but have different CMake scripts. The scripts are adapted to the specifics of the projects...

enhancement

build

refactoring

Feature request: change font size

I think it would be useful to be able to control the font size of the text in the main window. Main use case for me is to reduce the...

whisper : update the "talk" example

The `talk` and `talk.wasm` examples have become a bit stale, using a very old implementation of `gpt-2`: - https://github.com/ggerganov/whisper.cpp/tree/master/examples/talk - https://github.com/ggerganov/whisper.cpp/tree/master/examples/talk.wasm It would be nice to bring those examples up-to-date...

good first issue