Phil H

2 issues by Phil H

Adds a new text field to the llama.cpp and llamacpp_HF loaders that exposes llama.cpp's `--override-kv` option. Depends on abetlen/llama-cpp-python#1011.

When layers are offloaded with CUDA, sending identical requests to the examples/server completion API returns a different response the "first time": `` $ for x in `seq 5`; do curl... ``

bug
good first issue