text-generation-webui
add KV override field for llama.cpp loaders
Adds a new text field for the llama.cpp and llamacpp_HF loaders to implement llama.cpp's `--override-kv`.
Depends on abetlen/llama-cpp-python#1011.
Does it work with the updated llama-cpp-python 0.2.29? Could you merge the dev branch?
`llama_cpp.Llama` now expects a `Dict[str, Union[bool, int, float]]` for `kv_overrides`. Wondering if there's a better way in Gradio to accept these params other than string parsing (although that is consistent with the llama.cpp CLI).
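For reference, a minimal sketch of how the text field's string could be parsed into the dict that `llama_cpp.Llama` expects. The `KEY=TYPE:VALUE` syntax follows llama.cpp's `--override-kv` flag; the helper name and comma-separated multi-override handling are assumptions, not what this PR necessarily ships:

```python
from typing import Dict, Union

def parse_kv_overrides(text: str) -> Dict[str, Union[bool, int, float]]:
    """Parse comma-separated KEY=TYPE:VALUE pairs (llama.cpp --override-kv
    syntax) into the dict llama-cpp-python expects. Hypothetical helper."""
    overrides: Dict[str, Union[bool, int, float]] = {}
    for item in text.split(","):
        item = item.strip()
        if not item:
            continue
        key, _, typed_value = item.partition("=")
        type_name, _, value = typed_value.partition(":")
        if type_name == "bool":
            overrides[key] = value.lower() == "true"
        elif type_name == "int":
            overrides[key] = int(value)
        elif type_name == "float":
            overrides[key] = float(value)
        else:
            raise ValueError(f"Unknown type {type_name!r} in {item!r}")
    return overrides

print(parse_kv_overrides(
    "llama.expert_used_count=int:3,tokenizer.ggml.add_bos_token=bool:false"
))
```

The resulting dict could then be passed as `Llama(..., kv_overrides=parse_kv_overrides(field_value))`.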
While looking into this and the latest llama-cpp-python changes, I see other parameters that may be interesting to include in the webui (e.g. `split_mode` to adjust layer allocation across multiple GPUs). Perhaps a more generic way to pass any experimental parameters directly to the `llama_cpp.Llama` constructor (parsing with `ast.literal_eval`?) would avoid this lengthy process of features needing interface changes in llama.cpp -> llama-cpp-python -> text-generation-webui. Any thoughts?
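The generic approach suggested above could look something like this sketch, where each value in a `key=value` string is interpreted as a Python literal via `ast.literal_eval` and anything unparseable is kept as a raw string. The helper name and the fallback behavior are assumptions for illustration:

```python
import ast

def parse_extra_params(text: str) -> dict:
    """Parse comma-separated 'key=value' pairs, interpreting each value
    as a Python literal (int, float, bool, ...) via ast.literal_eval.
    Values that aren't valid literals are kept as plain strings.
    Hypothetical sketch, not the PR's implementation."""
    params = {}
    for item in text.split(","):
        if "=" not in item:
            continue
        key, value = item.split("=", 1)
        try:
            params[key.strip()] = ast.literal_eval(value.strip())
        except (ValueError, SyntaxError):
            params[key.strip()] = value.strip()
    return params

print(parse_extra_params("split_mode=1, use_mmap=True"))
```

This would let a single Gradio text field forward arbitrary keyword arguments, e.g. `Llama(model_path=path, **parse_extra_params(field_value))`, without webui changes for each new llama-cpp-python parameter.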
This sounds good; it would solve the 'change number of Mixtral experts' issue raised here:
https://github.com/oobabooga/text-generation-webui/discussions/5367
Changed the PR to build a dict from the Gradio field, so this works with the current llama-cpp-python.