
LLM inference in C/C++

Results 1654 llama.cpp issues
Sorted by recently updated

- Building specific archs separately to get the maximum performance, smallest package size, and shortest build times possible (compare [a build for 7.5+8.0](https://download.ochafik.com/llama.cpp/llama-cpp-master-cuda-12.2-cap-7.5_8.0.zip) vs. [just 7.5](https://download.ochafik.com/llama.cpp/llama-cpp-master-cuda-12.2-cap-7.5.zip), for instance: `libggml-cuda.so` is almost...
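The per-architecture split described above can be sketched with CMake's standard CUDA architecture list. `GGML_CUDA` and `CMAKE_CUDA_ARCHITECTURES` are real llama.cpp/CMake options, but the build-directory names here are illustrative:

```shell
# Build only for compute capability 7.5 — smallest binary, shortest build time.
cmake -B build-75 -DGGML_CUDA=ON -DCMAKE_CUDA_ARCHITECTURES="75"
cmake --build build-75 --config Release -j

# Build for 7.5 and 8.0 together — one package covering both, at the cost of size.
cmake -B build-75-80 -DGGML_CUDA=ON -DCMAKE_CUDA_ARCHITECTURES="75;80"
cmake --build build-75-80 --config Release -j
```

Each build directory then contains a `libggml-cuda.so` compiled for only the listed capabilities, which is what makes the single-arch package noticeably smaller.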

devops
ggml

* Enabled an optional command-line argument `-o` on tts to specify an output filename; it defaults to `output.wav`.
* The program now returns `ENOENT` in case of a file write failure.

examples

### Name and Version

/build/bin/llama-quantize /mnt/data/model/Moonlight-16B-A3B-Instruct/Moonlight-16B-A3B-Instruct-BF16.gguf /mnt/data/model/Moonlight-16B-A3B-Instruct/Moonlight-16B-A3B-Instruct-Q4_K_M.gguf Q4_K_M

### Operating systems

Linux

### GGML backends

CUDA

### Hardware

RTX 4090D

### Models

Moonlight-16B-A3B-Instruct

### Problem description & steps to reproduce...

bug-unconfirmed

### Name and Version

llama-cli.exe --version
version: 4735 (73e2ed3c)
built with MSVC 19.42.34436.0 for x64

### Operating systems

Windows

### Which llama.cpp modules do you know to be affected?

llama-server...

bug-unconfirmed