Georgi Gerganov
We've recently introduced the `--hf-repo` and `--hf-file` helper args to `common` in https://github.com/ggerganov/llama.cpp/pull/6234: ``` ref #4735 #5501 #6085 #6098 Sample usage: ./bin/main \ --hf-repo TinyLlama/TinyLlama-1.1B-Chat-v0.2-GGUF \ --hf-file ggml-model-q4_0.gguf \ -m...
Sample usage:
```bash
# ggtag-tweet
# example:
ggtag-tweet 1634602943319080960
```
Making this repo to automatically test the submitted PRs and potentially merge them if they satisfy the challenge requirements was kind of interesting to me so here is a quick...
**Describe the bug** Running the following command does not print any output to the terminal - it seems like the process just hangs: ```bash tabby serve --device metal --model TabbyML/Codestral-22B...
**Describe the bug** I'm not 100% sure what the cause is, but it seems that the presence of certain Unicode characters inside the file somehow messes up the completion requests. Here...
Attempt to move `llama_sampling_context` into the `llama` library and update the sampling API to use it instead of `llama_context` - [x] Remove `LLAMA_API_INTERNAL` TODO: - [ ] Apply https://github.com/ggerganov/llama.cpp/pull/7424 ---...
fix #8615 Propose to determine the max number of nodes based on the model info (arch, hparams, etc.) - [x] I have read the [contributing guidelines](https://github.com/ggerganov/llama.cpp/blob/master/CONTRIBUTING.md) - Self-reported review complexity:...
Change `llama_pos` from `int32_t` to `float`. This change might seem unnecessary at first, as we are used to thinking of token positions as integers, but technically nothing prevents these to...
Rewrite the logging functionality in `common/log.h` with main goals:
- asynchronous logging
- log to file should be possible to disable
- compile-time verbosity level
- colors
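The goals listed above could be sketched roughly as follows. This is a minimal illustration only, not the code in `common/log.h`: the class name `AsyncLogger`, the macro `SKETCH_LOG_VERBOSITY`, and the `log<Level>()` method are all hypothetical names made up for this example. A worker thread drains a queue so callers never block on I/O, passing a null `FILE *` disables the log file, the level check is a constant expression the compiler can fold away, and colors apply to terminal output only.

```cpp
#include <condition_variable>
#include <cstdio>
#include <mutex>
#include <queue>
#include <string>
#include <thread>
#include <utility>

// Hypothetical compile-time verbosity threshold (not a llama.cpp macro).
#ifndef SKETCH_LOG_VERBOSITY
#define SKETCH_LOG_VERBOSITY 2
#endif

// Hypothetical asynchronous logger: callers enqueue, a worker thread writes.
class AsyncLogger {
public:
    // file == nullptr disables logging to a file; colors affects stderr only.
    explicit AsyncLogger(std::FILE * file = nullptr, bool colors = false)
        : file_(file), colors_(colors), done_(false), worker_([this] { run(); }) {}

    ~AsyncLogger() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            done_ = true;
        }
        cv_.notify_one();
        worker_.join(); // the worker flushes the queue before exiting
    }

    // Level is a template parameter, so the comparison below is a constant
    // expression and calls above the threshold can be optimized out entirely.
    template <int Level>
    void log(const std::string & msg) {
        if (Level > SKETCH_LOG_VERBOSITY) {
            return;
        }
        {
            std::lock_guard<std::mutex> lock(mutex_);
            queue_.push(msg);
        }
        cv_.notify_one();
    }

private:
    void run() {
        std::unique_lock<std::mutex> lock(mutex_);
        for (;;) {
            cv_.wait(lock, [this] { return done_ || !queue_.empty(); });
            while (!queue_.empty()) {
                std::string msg = std::move(queue_.front());
                queue_.pop();
                lock.unlock(); // do the slow I/O without holding the lock
                write(msg);
                lock.lock();
            }
            if (done_) {
                return;
            }
        }
    }

    void write(const std::string & msg) {
        if (colors_) {
            std::fprintf(stderr, "\033[32m%s\033[0m\n", msg.c_str());
        } else {
            std::fprintf(stderr, "%s\n", msg.c_str());
        }
        if (file_ != nullptr) {
            std::fprintf(file_, "%s\n", msg.c_str()); // plain text, no colors
            std::fflush(file_);
        }
    }

    std::FILE * file_;
    bool colors_;
    bool done_;
    std::mutex mutex_;
    std::condition_variable cv_;
    std::queue<std::string> queue_;
    std::thread worker_; // declared last so it starts after the other members
};
```

With the default threshold assumed here, a call such as `logger.log<1>("model loaded")` is queued and written by the worker, while `logger.log<9>("token dump")` returns immediately without touching the queue.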
This feature was proposed by @spion in https://github.com/ggerganov/llama.cpp/issues/2813#issuecomment-1694390583 > In some cases, its useful to do constrained evaluation of logits based on a union of possible text values, then pick...