
llama binary release is missing libraries (Ubuntu, but perhaps others too)

Open 0wwafa opened this issue 1 year ago • 3 comments

To reproduce:

latest=`curl -qsI "https://github.com/ggml-org/llama.cpp/releases/latest"|grep location|cut -d " " -f 2`
latest=`basename $latest|tr -d "\r"`
#latest='b4000' # <----------- this works but afterwards it does not.
echo Downloading llama.cpp binaries
wget &>/dev/null "https://github.com/ggml-org/llama.cpp/releases/download/$latest/llama-$latest-bin-ubuntu-x64.zip" -O llama.zip
echo Unzipping llama.cpp binaries
unzip &>/dev/null llama.zip
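As an aside, parsing the redirect header is fragile; the latest tag can also be resolved through the GitHub releases API (a sketch assuming only curl and sed are available; the sed pattern is an assumption about the JSON layout):

```shell
# Ask the GitHub API for the latest release and extract its tag_name
# with a simple sed pattern (no jq dependency).
latest=$(curl -fsSL "https://api.github.com/repos/ggml-org/llama.cpp/releases/latest" \
  | sed -n 's/.*"tag_name": *"\([^"]*\)".*/\1/p' | head -n 1)
echo "$latest"
```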

./build/bin/llama-quantize

./build/bin/llama-quantize: error while loading shared libraries: libllama.so: cannot open shared object file: No such file or directory
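A missing-library error like this can be inspected with ldd, which lists the shared objects a binary needs and marks unresolved ones as "not found" (the paths below assume the unzipped release layout from the script above):

```shell
# Show which shared libraries the binary wants; libllama.so should
# appear as "not found" when the loader cannot locate it.
ldd ./build/bin/llama-quantize

# Re-check with the release's own directory on the search path:
LD_LIBRARY_PATH=build/bin ldd ./build/bin/llama-quantize
```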

0wwafa avatar Feb 15 '25 18:02 0wwafa

Use:

LD_LIBRARY_PATH=build/bin ./build/bin/llama-quantize

ggerganov avatar Feb 15 '25 19:02 ggerganov

Last time I checked, there was no libllama.so in ./build/bin/.

0wwafa avatar Feb 16 '25 02:02 0wwafa

I copy-pasted the exact commands you provided and it works. You just need to cd into ./build/bin first.

user@r-ngxson-my-cloud-pc-trgkc43y-4e70a-5f3q0:~/app$ nano test.sh
user@r-ngxson-my-cloud-pc-trgkc43y-4e70a-5f3q0:~/app$ cat test.sh 
latest=`curl -qsI "https://github.com/ggml-org/llama.cpp/releases/latest"|grep location|cut -d " " -f 2`
latest=`basename $latest|tr -d "\r"`
#latest='b4000' # <----------- this works but afterwards it does not.
echo Downloading llama.cpp binaries
wget &>/dev/null "https://github.com/ggml-org/llama.cpp/releases/download/$latest/llama-$latest-bin-ubuntu-x64.zip" -O llama.zip
echo Unzipping llama.cpp binaries
unzip &>/dev/null llama.zip

user@r-ngxson-my-cloud-pc-trgkc43y-4e70a-5f3q0:~/app$ chmod +x test.sh 
user@r-ngxson-my-cloud-pc-trgkc43y-4e70a-5f3q0:~/app$ ./test.sh 
Downloading llama.cpp binaries
Unzipping llama.cpp binaries
user@r-ngxson-my-cloud-pc-trgkc43y-4e70a-5f3q0:~/app/build/bin$ ls
LICENSE                        llama-eval-callback            llama-lookup-merge     llama-simple-chat         test-grammar-integration
LICENSE.linenoise.cpp          llama-export-lora              llama-lookup-stats     llama-speculative         test-grammar-parser
libggml-base.so                llama-gbnf-validator           llama-minicpmv-cli     llama-speculative-simple  test-json-schema-to-grammar
libggml-cpu.so                 llama-gen-docs                 llama-parallel         llama-tokenize            test-llama-grammar
libggml-rpc.so                 llama-gguf                     llama-passkey          llama-tts                 test-log
libggml.so                     llama-gguf-hash                llama-perplexity       llama-vdot                test-model-load-cancel
libllama.so                    llama-gguf-split               llama-q8dot            rpc-server                test-quantize-fns
libllava_shared.so             llama-gritlm                   llama-quantize         test-arg-parser           test-quantize-perf
llama-batched                  llama-imatrix                  llama-quantize-stats   test-autorelease          test-rope
llama-batched-bench            llama-infill                   llama-qwen2vl-cli      test-backend-ops          test-sampling
llama-bench                    llama-llava-cli                llama-retrieval        test-barrier              test-tokenizer-0
llama-cli                      llama-llava-clip-quantize-cli  llama-run              test-c                    test-tokenizer-1-bpe
llama-convert-llama2c-to-ggml  llama-lookahead                llama-save-load-state  test-chat                 test-tokenizer-1-spm
llama-cvector-generator        llama-lookup                   llama-server           test-chat-template
llama-embedding                llama-lookup-create            llama-simple           test-gguf
user@r-ngxson-my-cloud-pc-trgkc43y-4e70a-5f3q0:~/app/build/bin$ ./llama-quantize
usage: ./llama-quantize [--help] [--allow-requantize] [--leave-output-tensor] [--pure] [--imatrix] [--include-weights] [--exclude-weights] [--output-tensor-type] [--token-embedding-type] [--override-kv] model-f32.gguf [model-quant.gguf] type [nthreads]
...

ngxson avatar Feb 16 '25 14:02 ngxson

Yep. It works in the latest versions.

0wwafa avatar Mar 13 '25 11:03 0wwafa