llama.cpp
The llama.cpp binary release is missing shared libraries (Ubuntu, but perhaps other targets too).
To reproduce:
latest=`curl -qsI "https://github.com/ggml-org/llama.cpp/releases/latest"|grep location|cut -d " " -f 2`
latest=`basename $latest|tr -d "\r"`
#latest='b4000' # <----------- this works but afterwards it does not.
echo Downloading llama.cpp binaries
wget &>/dev/null "https://github.com/ggml-org/llama.cpp/releases/download/$latest/llama-$latest-bin-ubuntu-x64.zip" -O llama.zip
echo Unzipping llama.cpp binaries
unzip &>/dev/null llama.zip
./build/bin/llama-quantize
./build/bin/llama-quantize: error while loading shared libraries: libllama.so: cannot open shared object file: No such file or directory
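As an aside, the tag lookup at the top of the script can be done with pure bash string operations instead of grep/cut/basename. This is a sketch only; the header value below is hypothetical example data, not a live response from GitHub:

```shell
# Extract the release tag from a GitHub "location" redirect header.
# The header line is a hypothetical example, not a real curl response.
location="location: https://github.com/ggml-org/llama.cpp/releases/tag/b4458"

url="${location#location: }"   # strip the header name and separator
url="${url%$'\r'}"             # drop the trailing CR that HTTP headers carry
tag="${url##*/}"               # basename: everything after the last slash

echo "$tag"                            # -> b4458
echo "llama-$tag-bin-ubuntu-x64.zip"   # -> llama-b4458-bin-ubuntu-x64.zip
```

The `${url%$'\r'}` step matters: without it the carriage return ends up inside the download URL, which is the kind of silent breakage `tr -d "\r"` guards against in the original script.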
Use:
LD_LIBRARY_PATH=build/bin ./build/bin/llama-quantize
Last time I checked there was no libllama.so in ./build/bin/.
I copy-pasted the exact command you provided and it works. You just need to cd into ./build/bin first.
user@r-ngxson-my-cloud-pc-trgkc43y-4e70a-5f3q0:~/app$ nano test.sh
user@r-ngxson-my-cloud-pc-trgkc43y-4e70a-5f3q0:~/app$ cat test.sh
latest=`curl -qsI "https://github.com/ggml-org/llama.cpp/releases/latest"|grep location|cut -d " " -f 2`
latest=`basename $latest|tr -d "\r"`
#latest='b4000' # <----------- this works but afterwards it does not.
echo Downloading llama.cpp binaries
wget &>/dev/null "https://github.com/ggml-org/llama.cpp/releases/download/$latest/llama-$latest-bin-ubuntu-x64.zip" -O llama.zip
echo Unzipping llama.cpp binaries
unzip &>/dev/null llama.zip
user@r-ngxson-my-cloud-pc-trgkc43y-4e70a-5f3q0:~/app$ chmod +x test.sh
user@r-ngxson-my-cloud-pc-trgkc43y-4e70a-5f3q0:~/app$ ./test.sh
Downloading llama.cpp binaries
Unzipping llama.cpp binaries
user@r-ngxson-my-cloud-pc-trgkc43y-4e70a-5f3q0:~/app/build/bin$ ls
LICENSE llama-eval-callback llama-lookup-merge llama-simple-chat test-grammar-integration
LICENSE.linenoise.cpp llama-export-lora llama-lookup-stats llama-speculative test-grammar-parser
libggml-base.so llama-gbnf-validator llama-minicpmv-cli llama-speculative-simple test-json-schema-to-grammar
libggml-cpu.so llama-gen-docs llama-parallel llama-tokenize test-llama-grammar
libggml-rpc.so llama-gguf llama-passkey llama-tts test-log
libggml.so llama-gguf-hash llama-perplexity llama-vdot test-model-load-cancel
libllama.so llama-gguf-split llama-q8dot rpc-server test-quantize-fns
libllava_shared.so llama-gritlm llama-quantize test-arg-parser test-quantize-perf
llama-batched llama-imatrix llama-quantize-stats test-autorelease test-rope
llama-batched-bench llama-infill llama-qwen2vl-cli test-backend-ops test-sampling
llama-bench llama-llava-cli llama-retrieval test-barrier test-tokenizer-0
llama-cli llama-llava-clip-quantize-cli llama-run test-c test-tokenizer-1-bpe
llama-convert-llama2c-to-ggml llama-lookahead llama-save-load-state test-chat test-tokenizer-1-spm
llama-cvector-generator llama-lookup llama-server test-chat-template
llama-embedding llama-lookup-create llama-simple test-gguf
user@r-ngxson-my-cloud-pc-trgkc43y-4e70a-5f3q0:~/app/build/bin$ ./llama-quantize
usage: ./llama-quantize [--help] [--allow-requantize] [--leave-output-tensor] [--pure] [--imatrix] [--include-weights] [--exclude-weights] [--output-tensor-type] [--token-embedding-type] [--override-kv] model-f32.gguf [model-quant.gguf] type [nthreads]
...
Yep. It works in the latest versions.