Sigbjørn Skjæret
You're almost there. :) First off, installing CMake and setting variables is not necessary if you're going to use the pip wheels (that is only needed if you are building llama.cpp, but then...
Any suggestions on how to approach this? It was merged into llama.cpp a while ago, and many GGUFs already have the new metadata. I suppose adding e.g. a chat_template_name...
It might be a better idea to add `documents` as a direct parameter, due to huggingface/transformers#30621. We can still keep `template_kwargs` for future use.
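A minimal sketch of what that could look like, assuming the template is rendered from a plain context dict. The helper name `build_template_context` and its exact signature are hypothetical and not taken from any existing API; only `documents` and `template_kwargs` come from the comment above.

```python
from typing import Any, Dict, List, Optional


def build_template_context(
    messages: List[Dict[str, Any]],
    documents: Optional[List[Dict[str, str]]] = None,
    template_kwargs: Optional[Dict[str, Any]] = None,
) -> Dict[str, Any]:
    """Hypothetical helper: merge explicit parameters and extra kwargs
    into the variables handed to a chat template."""
    # Start from the free-form kwargs so explicit parameters take precedence.
    context: Dict[str, Any] = dict(template_kwargs or {})
    context["messages"] = messages
    # Only expose `documents` when the caller actually passed some,
    # so templates can test whether the variable is defined.
    if documents is not None:
        context["documents"] = documents
    return context
```

Letting the explicit `documents` parameter override anything passed through `template_kwargs` is a design choice made for this sketch, not something the linked PR prescribes.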
Sure, [pmysl](https://huggingface.co/pmysl/c4ai-command-r-plus-GGUF) was the first one to update their quants. If R+ is a bit too hefty, try [LlamaEdge](https://huggingface.co/second-state/C4AI-Command-R-v01-GGUF)'s Command R quant. My main worry about using chat_format is that...
@abetlen That seems reasonable. I'm thinking of registering `chat_template.default` etc. as chat formats at init, using the Jinja2 handler setup that is done as a fallback today, and then just falling back to `chat_template.default` (if...
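A rough sketch of that idea, not the actual implementation. It assumes the GGUF metadata stores the default template under `tokenizer.chat_template` and named ones under `tokenizer.chat_template.<name>`; the helper names and the `llama-2` last-resort value are assumptions made for illustration.

```python
from typing import Dict, Optional

# Assumed metadata key layout (see lead-in).
TEMPLATE_KEY = "tokenizer.chat_template"


def collect_templates(metadata: Dict[str, str]) -> Dict[str, str]:
    """Hypothetical: map embedded templates to chat format names."""
    registry: Dict[str, str] = {}
    for key, value in metadata.items():
        if key == TEMPLATE_KEY:
            registry["chat_template.default"] = value
        elif key.startswith(TEMPLATE_KEY + "."):
            name = key[len(TEMPLATE_KEY) + 1:]
            registry[f"chat_template.{name}"] = value
    return registry


def resolve_chat_format(
    requested: Optional[str],
    registry: Dict[str, str],
    last_resort: str = "llama-2",  # assumed final default, for illustration
) -> str:
    """Hypothetical: prefer an explicit choice, then the embedded default."""
    if requested is not None:
        return requested
    if "chat_template.default" in registry:
        return "chat_template.default"
    return last_resort
```

With this shape, a caller could pick a named template such as `chat_template.rag` explicitly, while leaving `chat_format` unset would select the embedded default whenever one exists.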
WIP changes worth paying attention to: huggingface/transformers#30621
Another related PR is huggingface/transformers#31429, which could be nice to replicate here; however, it requires us to differentiate between specifically selecting `chat_template.default` and defaulting to it, as we may...
This should not strictly be necessary, as recent GGUFs have the chat format embedded (which will be automatically applied through Jinja2ChatFormatter). I've submitted a request in older repos on HF...
@uncodecomplexsystems As you say, it's just a minor merge, I'm not opposed to it, I'm just saying it's not strictly necessary. :)
> I thought those with `None` were fails, but do they actually get their chat format correctly from the template?

Yes, `None` means it found an embedded template (that is...