Sigbjørn Skjæret comments

Results: 71 comments of Sigbjørn Skjæret

I use Python's llama-cpp package to run the code. There is a CUDA environment and the contents of llama.cpp (compiled), but I still cannot use the GPU.

You're almost there. :) First off, installing CMake and setting variables is not necessary if you're going to use the pip wheels (only if you are building llama.cpp, but then...

Models with multiple chat templates

Any suggestions on how to approach this? It has been merged in llama.cpp for a while now, and many GGUFs already have the new metadata. I suppose adding e.g. a chat_template_name...

Support multiple chat templates - step 2

It might be a better idea to add `documents` as a direct parameter due to huggingface/transformers#30621. We can still keep `template_kwargs` for future use.

Models with multiple chat templates

Sure, [pmysl](https://huggingface.co/pmysl/c4ai-command-r-plus-GGUF) was the first one to update their quants. If R+ is a bit too hefty, try [LlamaEdge](https://huggingface.co/second-state/C4AI-Command-R-v01-GGUF)'s Command R quant. My main worry about using chat_format is that...

Models with multiple chat templates

@abetlen That seems reasonable. I'm thinking of registering `chat_template.default` etc. as chat formats at init, with the Jinja2 handler setup that is done as a fallback today, and then just falling back to `chat_template.default` (if...
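The registration idea in this comment could be sketched roughly as follows. This is only an illustration under stated assumptions, not the project's implementation: it assumes the GGUF metadata is available as a plain dict, that the default template lives under `tokenizer.chat_template`, and that named templates live under `tokenizer.chat_template.<name>`. The helper names `collect_chat_templates` and `resolve_chat_format` are hypothetical, and silently falling back when the requested name is missing is a design choice of this sketch.

```python
from typing import Dict, Mapping, Optional

# Assumed GGUF metadata key for the default chat template; named
# templates are assumed to use "<PREFIX>.<name>".
PREFIX = "tokenizer.chat_template"
DEFAULT_FORMAT = "chat_template.default"


def collect_chat_templates(metadata: Mapping[str, str]) -> Dict[str, str]:
    """Map every embedded template to a 'chat_template.<name>' format name."""
    formats: Dict[str, str] = {}
    for key, value in metadata.items():
        if key == PREFIX:
            # The unnamed template is registered as the default format.
            formats[DEFAULT_FORMAT] = value
        elif key.startswith(PREFIX + "."):
            # Named template, e.g. "tokenizer.chat_template.tool_use".
            name = key[len(PREFIX) + 1:]
            formats[f"chat_template.{name}"] = value
    return formats


def resolve_chat_format(formats: Mapping[str, str],
                        requested: Optional[str] = None) -> str:
    """Return the requested format if registered, else fall back to the default."""
    if requested is not None and requested in formats:
        return requested
    if DEFAULT_FORMAT in formats:
        return DEFAULT_FORMAT
    raise KeyError("no embedded chat template found")
```

With metadata containing a default and a `tool_use` template, `resolve_chat_format(formats, "chat_template.tool_use")` returns the named format, while an unknown or omitted name resolves to `chat_template.default`.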

Models with multiple chat templates

WIP changes worth paying attention to: huggingface/transformers#30621

Models with multiple chat templates

Another related PR is huggingface/transformers#31429, which could be nice to replicate here. However, it requires us to differentiate between specifically selecting `chat_template.default` and defaulting to it, as we may...
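One way to keep an explicit selection of `chat_template.default` distinguishable from merely defaulting to it is to treat `None` as "the caller made no choice" and report that alongside the result. The sketch below is an assumption-laden illustration: the function `select_template` and its return shape are hypothetical and do not come from either project.

```python
from typing import Mapping, Optional, Tuple

DEFAULT_FORMAT = "chat_template.default"


def select_template(templates: Mapping[str, str],
                    requested: Optional[str] = None) -> Tuple[str, bool]:
    """Return (format_name, explicit).

    explicit is False only when the caller passed nothing and the
    default was chosen implicitly.
    """
    if requested is None:
        if DEFAULT_FORMAT not in templates:
            raise KeyError("no default chat template available")
        return DEFAULT_FORMAT, False
    if requested not in templates:
        raise KeyError(f"unknown chat format: {requested}")
    # The caller named a format, even if it happens to be the default one.
    return requested, True
```

Code that wants to swap in a more specific template automatically can then do so only when `explicit` is `False`, leaving a deliberate choice of the default untouched.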

Add the Command R chat format

This should not strictly be necessary, as recent GGUFs have the chat format embedded (which will be automatically applied through Jinja2ChatFormatter). I've submitted a request in older repos on HF...

Add the Command R chat format

@uncodecomplexsystems As you say, it's just a minor merge. I'm not opposed to it; I'm just saying it's not strictly necessary. :)

Add the Command R chat format

> I thought those with `None` were fails, but do they actually get their chat format correctly from the template?

Yes, `None` means it found an embedded template (that is...