Phil H
Results: 2 issues of Phil H
Adds a new text field to the llama.cpp and llamacpp_HF loaders to implement llama.cpp's `--override-kv` option. Depends on abetlen/llama-cpp-python#1011.
When layers are offloaded with CUDA, sending identical requests to the examples/server completion API returns a different response the "first time": ``` $ for x in `seq 5`; do curl...
bug
good first issue