BitNet
How to compile a server.exe?
I'm looking for functionality similar to the llama.cpp HTTP server (./llama-server).
This can provide you a form of HTTP server: https://github.com/microsoft/BitNet/discussions/110
It's for local use only though, not for production.
You can build llama-server and run it with the current codebase.
To compile the server:
- Go to 3rdparty/llama.cpp
- Run:
mkdir build && cmake -B build && cmake --build build --config Release --target llama-server
If the compilation succeeds, the binary will be available in build/bin.
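Since the question asks about a server.exe, it may help to check where the binary actually landed. The sketch below is only an illustration: `find_llama_server` is a hypothetical helper, and the `build/bin/Release` path is an assumption about how multi-config generators such as Visual Studio lay out their output.

```shell
# Hypothetical helper: print the first llama-server binary found under a build directory.
# The candidate paths are assumptions: single-config generators (Make, Ninja) usually
# emit build/bin/llama-server, while multi-config ones often add a Release/ folder.
find_llama_server() {
    build_dir="${1:-build}"
    for candidate in \
        "$build_dir/bin/llama-server" \
        "$build_dir/bin/llama-server.exe" \
        "$build_dir/bin/Release/llama-server.exe"; do
        if [ -f "$candidate" ]; then
            printf '%s\n' "$candidate"
            return 0
        fi
    done
    echo "llama-server not found under $build_dir/bin" >&2
    return 1
}
```

Run from 3rdparty/llama.cpp, `find_llama_server build` prints the path if one of the candidates exists and returns a non-zero status otherwise.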
To compile the server:
- First, enable the server build flag:
cmake -S . -B build -DLLAMA_BUILD_SERVER=ON
- Then re-run setup_env:
python setup_env.py -md models/BitNet-b1.58-2B-4T -q i2_s
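- The build/bin/ directory will now contain llama-server, which takes the standard llama-server options (here -t sets the thread count, -np the number of parallel slots, and --prio the process priority):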
./build/bin/llama-server -m models/BitNet-b1.58-2B-4T/ggml-model-i2_s.gguf --port 18080 -t 3 -np 2 --prio 3
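Once the server is up, a quick way to confirm it responds is to query it with curl. This is a minimal sketch, not part of the BitNet repo: the helper names are hypothetical, the host and port mirror the command above, and the `/health` and `/completion` endpoints are assumed to match the upstream llama.cpp server API.

```shell
# Assumptions: the server from the command above is listening on 127.0.0.1:18080
# and exposes the upstream llama.cpp endpoints /health and /completion.
LLAMA_HOST="${LLAMA_HOST:-127.0.0.1}"
LLAMA_PORT="${LLAMA_PORT:-18080}"

# Hypothetical helper: build the URL for a given endpoint path.
server_url() {
    printf 'http://%s:%s%s' "$LLAMA_HOST" "$LLAMA_PORT" "$1"
}

# Hypothetical helper: build a minimal JSON request body.
# No JSON escaping is done, so keep the prompt free of quotes and backslashes.
completion_body() {
    printf '{"prompt": "%s", "n_predict": %d}' "$1" "${2:-32}"
}

# Liveness check: exits non-zero if the server is unreachable or returns an error.
check_health() {
    curl -sf "$(server_url /health)"
}

# Request a completion for prompt $1, generating at most $2 tokens (default 32).
complete() {
    curl -sf -X POST "$(server_url /completion)" \
        -H 'Content-Type: application/json' \
        -d "$(completion_body "$1" "$2")"
}
```

With the functions loaded in the current shell, `check_health && complete "BitNet is" 16` sends one short request. As noted earlier in the thread, this setup is intended for local use only.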