llama.cpp
LLM inference in C/C++
### Name and Version
build: 4761 (cad53fc9) with cc (Ubuntu 11.4.0-1ubuntu1~22.04) 11.4.0 for x86_64-linux-gnu

### Operating systems
Linux

### Which llama.cpp modules do you know to be affected?
llama-server

###...
The lowest architecture supported by CUDA 12 is Maxwell, and compute capability 5.0 is the lowest one in the Maxwell family.
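As a concrete illustration (a sketch, not the project's documented procedure; the `GGML_CUDA` option and the standard CMake `CMAKE_CUDA_ARCHITECTURES` variable are assumed here), a CUDA 12 build targeting that minimum architecture would pin the compute capability to 5.0:

```
# Hypothetical configure step pinning the CUDA architecture to Maxwell (5.0),
# the oldest compute capability CUDA 12 accepts.
cmake -B build -DGGML_CUDA=ON -DCMAKE_CUDA_ARCHITECTURES=50
cmake --build build --config Release -j
```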
This PR fixes the bug outlined in this issue: https://github.com/ggml-org/llama.cpp/issues/10157. It is also discussed in projects that leverage llama.cpp, such as Ollama: https://github.com/ollama/ollama/issues/7441 https://github.com/ollama/ollama-python/issues/433

### Summary
In `clip.cpp`, we initialize a...
### Prerequisites
- [x] I am running the latest code. Mention the version if possible as well.
- [x] I carefully followed the [README.md](https://github.com/ggml-org/llama.cpp/blob/master/README.md).
- [x] I searched using keywords...
### Name and Version
version: 4526 (a94f3b27) built with cc (Ubuntu 12.3.0-1ubuntu1~22.04) 12.3.0 for x86_64-linux-gnu

Not sure when this started, but previously, when using llama-cli with --log-disable, I would...
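For context, an invocation exercising that flag would look like the following (model path and prompt are placeholders, not taken from the report):

```
# Example run with logging disabled; -m and -p values are placeholders.
./llama-cli -m ./model.gguf -p "Hello" --log-disable
```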
### Name and Version
./llama-cli --version
CANNOT LINK EXECUTABLE "./llama-cli": library "libomp.so" not found: needed by main executable

### Operating systems
Other? (Please let us know in description)

### GGML...
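The error is the dynamic linker failing to locate the OpenMP runtime at startup. A common workaround, offered as a sketch assuming an Android NDK cross-build (the NDK library path below is hypothetical), is to either build without OpenMP or ship libomp.so alongside the binary:

```
# Option 1: configure without OpenMP so libomp.so is never required
# (GGML_OPENMP is assumed to be the relevant build option).
cmake -B build -DGGML_OPENMP=OFF

# Option 2: push the libomp.so bundled with your NDK (path is hypothetical)
# and point the loader at it.
adb push "$NDK/.../libomp.so" /data/local/tmp/
adb shell 'cd /data/local/tmp && LD_LIBRARY_PATH=. ./llama-cli --version'
```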
According to this comment, https://github.com/ggerganov/llama.cpp/discussions/336#discussioncomment-11184134, there is a new CoreML API, and an ANE backend might be possible to implement with the latest Apple software/hardware.
```
root@orangepiaipro-20t:/data/llama.cpp# cmake -B build -DGGML_CANN=on -DCMAKE_BUILD_TYPE=release
-- Warning: ccache not found - consider installing it for faster compilation or disable this warning with GGML_CCACHE=OFF
-- CMAKE_SYSTEM_PROCESSOR: aarch64
-- Including...
```
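If that configure step completes, the follow-up build step would be the standard CMake one (a sketch; it is not part of the log above):

```
# Standard CMake build step following the configure command shown above.
cmake --build build --config release -j
```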
I'm trying to convert this GGML model to GGUF, but I get this error. Thank you.

```
python convert_llama_ggml_to_gguf.py --input "D:\nectec\model\llama-2-13b-chat.ggmlv3.q2_K.bin" --output "D:\nectec\model\llama-2-13b-chat.gguf"
INFO:ggml-to-gguf:* Using config: Namespace(input=WindowsPath('D:/nectec/model/llama-2-13b-chat.ggmlv3.q2_K.bin'), output=WindowsPath('D:/nectec/model/llama-2-13b-chat.gguf'), name=None, desc=None, gqa=8, eps='0',...
```
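One likely culprit, offered as an assumption rather than a confirmed diagnosis: the logged config shows gqa=8 and eps='0', but the converter's --gqa 8 setting is meant for LLaMA-2 70B, and LLaMA-2 models use an RMS-norm eps of 1e-5. A 13B conversion would then look like:

```
# Hypothetical corrected invocation for a 13B LLaMA-2 model (assumption:
# keep the default --gqa 1 for 13B, and pass --eps 1e-5 for LLaMA-2).
python convert_llama_ggml_to_gguf.py --input "D:\nectec\model\llama-2-13b-chat.ggmlv3.q2_K.bin" --output "D:\nectec\model\llama-2-13b-chat.gguf" --eps 1e-5
```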