Georgi Gerganov

Results 136 issues of Georgi Gerganov

We've recently introduced the `--hf-repo` and `--hf-file` helper args to `common` in https://github.com/ggerganov/llama.cpp/pull/6234: ``` ref #4735 #5501 #6085 #6098 Sample usage: ./bin/main \ --hf-repo TinyLlama/TinyLlama-1.1B-Chat-v0.2-GGUF \ --hf-file ggml-model-q4_0.gguf \ -m...

enhancement
good first issue
examples

Sample usage:
```bash
# ggtag-tweet
# example: ggtag-tweet 1634602943319080960
```
![IMG_1194](https://github.com/rgerganov/ggtag/assets/1991296/e4f763f9-17ea-41c8-a658-94d0eb367b35)

enhancement
good first issue

Making this repo automatically test the submitted PRs, and potentially merge them if they satisfy the challenge requirements, was kind of interesting to me, so here is a quick...

documentation

**Describe the bug** Running the following command does not print any output to the terminal - it seems like the process just hangs: ```bash tabby serve --device metal --model TabbyML/Codestral-22B...

bug-unconfirmed

**Describe the bug** I'm not 100% sure what the cause is, but it seems that the presence of certain Unicode characters inside the file somehow messes up the completion requests. Here...

bug

Attempt to move `llama_sampling_context` into the `llama` library and update the sampling API to start using it instead of `llama_context`
- [x] Remove `LLAMA_API_INTERNAL`
TODO:
- [ ] Apply https://github.com/ggerganov/llama.cpp/pull/7424
---...

testing
android
refactoring
examples
Review Complexity : Medium
server

fix #8615
Propose to determine the max number of nodes based on the model info (arch, hparams, etc.)
- [x] I have read the [contributing guidelines](https://github.com/ggerganov/llama.cpp/blob/master/CONTRIBUTING.md)
- Self-reported review complexity:...

Review Complexity : Low

Change `llama_pos` from `int32_t` to `float`. This change might seem unnecessary at first, as we are used to thinking about token positions as integers, but technically nothing prevents these to...

demo
refactoring
Review Complexity : High

Rewrite the logging functionality in `common/log.h` with main goals:
- asynchronous logging
- log to file should be possible to disable
- compile-time verbosity level
- colors

enhancement
refactoring
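
The four goals listed for the `common/log.h` rewrite can be illustrated with a small standalone sketch. This is not the llama.cpp implementation: every name here (`sketch_logger`, `sketch_format`, `SKETCH_LOG`, `SKETCH_LOG_MAX_LEVEL`) is hypothetical and chosen only for illustration. It assumes a single background thread that drains a queue, an optional `FILE *` that can be left null to disable file output, a preprocessor constant for the verbosity cutoff, and ANSI escape codes for colors.

```cpp
#include <condition_variable>
#include <cstddef>
#include <cstdio>
#include <mutex>
#include <queue>
#include <string>
#include <thread>
#include <utility>

// Hypothetical names throughout; none of these come from llama.cpp.
#ifndef SKETCH_LOG_MAX_LEVEL
#define SKETCH_LOG_MAX_LEVEL 2 // 0 = error, 1 = warn, 2 = info, 3 = debug
#endif

enum sketch_log_level { SKETCH_ERROR = 0, SKETCH_WARN = 1, SKETCH_INFO = 2, SKETCH_DEBUG = 3 };

// Build one log line; ANSI color codes are added only when colors == true.
inline std::string sketch_format(sketch_log_level level, const std::string & text, bool colors) {
    static const char   tags[]  = { 'E', 'W', 'I', 'D' };
    static const char * codes[] = { "\033[31m", "\033[33m", "\033[32m", "\033[36m" };
    std::string out;
    if (colors) out += codes[level];
    out += tags[level];
    out += ' ';
    out += text;
    if (colors) out += "\033[0m";
    out += '\n';
    return out;
}

class sketch_logger {
public:
    // file == nullptr disables logging to file; colors toggles ANSI escapes on stderr.
    explicit sketch_logger(FILE * file = nullptr, bool colors = false)
        : file_(file), colors_(colors), worker_([this] { run(); }) {}

    // Signal shutdown, then wait for the worker to drain the queue.
    ~sketch_logger() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            done_ = true;
        }
        cv_.notify_one();
        worker_.join();
    }

    // Callable from any thread: only enqueues; the worker thread does the I/O.
    void push(sketch_log_level level, std::string text) {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            queue_.emplace(level, std::move(text));
            ++pushed_;
        }
        cv_.notify_one();
    }

    // Number of messages accepted so far.
    std::size_t pushed() {
        std::lock_guard<std::mutex> lock(mutex_);
        return pushed_;
    }

private:
    void run() {
        std::unique_lock<std::mutex> lock(mutex_);
        for (;;) {
            cv_.wait(lock, [this] { return done_ || !queue_.empty(); });
            if (queue_.empty()) return; // woken with nothing queued, so done_ is set
            auto item = std::move(queue_.front());
            queue_.pop();
            lock.unlock(); // do not hold the lock while writing
            std::fputs(sketch_format(item.first, item.second, colors_).c_str(), stderr);
            if (file_) std::fputs(sketch_format(item.first, item.second, false).c_str(), file_);
            lock.lock();
        }
    }

    FILE * file_;
    bool   colors_;
    std::mutex mutex_;
    std::condition_variable cv_;
    std::queue<std::pair<sketch_log_level, std::string>> queue_;
    bool        done_   = false;
    std::size_t pushed_ = 0;
    std::thread worker_; // declared last so it starts after the other members exist
};

// The level check is against a preprocessor constant, so for a literal level
// the compiler can drop the call entirely.
#define SKETCH_LOG(logger, level, text) \
    do { if ((level) <= SKETCH_LOG_MAX_LEVEL) (logger).push((level), (text)); } while (0)
```

With the default cutoff of 2, `SKETCH_LOG(lg, SKETCH_INFO, "hello")` enqueues a message while `SKETCH_LOG(lg, SKETCH_DEBUG, "details")` is filtered out before reaching the queue. Building with `-DSKETCH_LOG_MAX_LEVEL=3` would let debug messages through.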

This feature was proposed by @spion in https://github.com/ggerganov/llama.cpp/issues/2813#issuecomment-1694390583
> In some cases, it's useful to do constrained evaluation of logits based on a union of possible text values, then pick...

good first issue
generation quality
research 🔬