Bartowski
@oobabooga sorry to resurrect, but I realized it's when I use the chat template (chatml) that it breaks. Not using a template allows it to generate. May be a...
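For reference, the ChatML template wraps each turn roughly like this (which is where the <|im_start|>/<|im_end|> markers in the rest of this thread come from):

```
<|im_start|>user
{prompt}<|im_end|>
<|im_start|>assistant
```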
That's interesting, cause the tokenizer config has im_start as 92543. I'll investigate more with that in mind and get back to you
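Roughly the kind of check I have in mind (a sketch only; "path/to/model" is a placeholder for the local model directory, and added_tokens_decoder needs a fairly recent transformers):

```python
from transformers import AutoTokenizer

# Placeholder path; point it at the local model folder whose tokenizer_config.json
# lists <|im_start|> as 92543.
tok = AutoTokenizer.from_pretrained("path/to/model", trust_remote_code=True)

print(tok.convert_tokens_to_ids("<|im_start|>"))  # expect 92543 if the config is honored
print(tok.added_tokens_decoder)                   # how the added/special tokens are registered
```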
putting it in added_tokens.json doesn't fix the HF one (though I may roll back to when the non-HF loader was in and try with that). I wonder if what this really means is...
Yeah so adding the "added_tokens.json" did work for the non-HF loader, im_start gets mapped properly now:

1 : "<s>"
92543 : "<|im_start|>"
1008 : "user"
364 : "\n"
2661 : ...
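Roughly what an added_tokens.json like that could look like (only the 92543 id for <|im_start|> is taken from the tokenizer config above; any other special tokens would need their ids checked there too):

```python
import json

# Sketch only: write an added_tokens.json next to the model files so the non-HF loader
# picks up the ChatML start marker. 92543 comes from this model's tokenizer config;
# other special tokens (e.g. an end-of-turn marker) would be added the same way once
# their ids are confirmed.
added_tokens = {
    "<|im_start|>": 92543,
}

with open("added_tokens.json", "w", encoding="utf-8") as f:
    json.dump(added_tokens, f, ensure_ascii=False, indent=2)
```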
Not sure if you noticed this @oobabooga, but it seems likely that the fix is to overwrite the tokens rather than append them, if possible
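Rough illustration of what I mean by overwrite rather than append (not the loader's actual code, just the idea: if an id from added_tokens.json is already occupied, replace that entry instead of appending a duplicate):

```python
def merge_added_tokens(vocab: dict, added: dict) -> dict:
    """Overwrite semantics: an added token claims its id even if the id is taken."""
    merged = dict(vocab)
    for token, token_id in added.items():
        # Drop whatever currently occupies this id, then map the added token to it.
        for existing, existing_id in list(merged.items()):
            if existing_id == token_id and existing != token:
                del merged[existing]
        merged[token] = token_id
    return merged

# Hypothetical vocab where id 92543 is already mapped to the wrong string.
vocab = {"user": 1008, "some_wrong_piece": 92543}
print(merge_added_tokens(vocab, {"<|im_start|>": 92543}))
# -> {'user': 1008, '<|im_start|>': 92543}
```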
@YakuzaSuske not everyone interacts on the same computer it's running on. I also use it on my phone; sometimes I'm curious and I'd rather not open a terminal, ssh, find the...
So is the implementation in https://github.com/ggerganov/llama.cpp/pull/6851 preferred or are both needed for official support?
> I have an X1 Omni, and I plan to get this running at my house. I will update as I go.

Did you ever get it working?
seeing this with https://huggingface.co/fblgit/UNA-ThePitbull-21.4-v1, which has the same \u0000 token. I wonder if the code needs a specific catch for it
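Something like this is the kind of specific catch I mean (purely a sketch, not the converter's actual code): when walking the vocab, treat a token containing the NUL character explicitly instead of letting it fall through the normal path.

```python
def needs_null_catch(token: str) -> bool:
    # Flag tokens that are, or contain, the literal NUL character so the conversion
    # code can handle them explicitly (e.g. map them to an unused/control slot)
    # instead of sending them down the normal byte/piece handling.
    return "\u0000" in token

print(needs_null_catch("\u0000"))  # True  -> take the special-case branch
print(needs_null_catch("user"))    # False -> normal handling
```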