Xuan Son Nguyen

Results: 73 comments by Xuan Son Nguyen

> Is this missing? `{% if loop.index0 == 0 %}{% set content = bos_token + content %}`

As explained in https://github.com/ggerganov/llama.cpp/pull/6751#discussion_r1571972422 , the BOS token is added by the tokenizer, so it...
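To illustrate the point above, here is a minimal sketch (tokens and function names are illustrative, not llama.cpp's actual API) of why the template must not prepend BOS when the tokenizer already does: the BOS would get duplicated.

```python
# Hypothetical llama3-style BOS token, for illustration only
BOS = "<|begin_of_text|>"

def apply_template(messages, add_bos_in_template):
    # Sketch of a chat template; the flag mimics the Jinja snippet above
    out = BOS if add_bos_in_template else ""
    for m in messages:
        out += f"<|start_header_id|>{m['role']}<|end_header_id|>\n\n{m['content']}<|eot_id|>"
    return out

def tokenize(text):
    # The tokenizer prepends BOS automatically, like llama.cpp's tokenizer does
    return BOS + text

msgs = [{"role": "user", "content": "Hi"}]
good = tokenize(apply_template(msgs, add_bos_in_template=False))
bad = tokenize(apply_template(msgs, add_bos_in_template=True))
assert good.count(BOS) == 1
assert bad.count(BOS) == 2  # duplicated BOS when the template adds it too
```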

FYI, I added llama3 to the list of supported templates on the [wiki page](https://github.com/ggerganov/llama.cpp/wiki/Templates-supported-by-llama_chat_apply_template). This PR looks good to me and should be ready to merge now. The failed CI job (build server) doesn't...

`` is the EOS token, so you don't need to include it in the list of stop words. In short, the server will stop generation once it receives the EOS token. For...
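A small sketch of the generation loop described above (token ids and helpers are hypothetical): the EOS token id halts generation directly, so the stop-word list is only needed for plain-text stop sequences.

```python
EOS_ID = 2  # hypothetical EOS token id

def generate(sample_next, stop_words, detok):
    """Generate until EOS or until the text ends with a stop word."""
    text = ""
    while True:
        tok = sample_next()
        if tok == EOS_ID:
            break  # the server stops on the EOS id itself, no stop word needed
        text += detok(tok)
        if any(text.endswith(w) for w in stop_words):
            break  # stop words are matched on the detokenized text
    return text

stream = iter([5, 6, EOS_ID, 7])
result = generate(lambda: next(stream), stop_words=[], detok=lambda t: f"<{t}>")
assert result == "<5><6>"  # generation halted at EOS with an empty stop-word list
```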

Yes, it should be removed. If we decide to add the EOS token as a stop sequence, we will also need to add it for the other templates (``, ``, ...)

Yes, it looks good to me. I'm just wondering if we want to wait for the other PR that allows converting the model, then test the converted model with this...

> Users that want to support a certain template should open a PR and implement it in the framework that we already have

Yeah I thought that would be ideal...

> I really do understand the temptation here, but it's best avoided.

Thanks for your input. For clarification, I'm **not** saying that my proposal solves all the issues we have...

I came across https://github.com/ollama/ollama/issues/1977 and I feel like we're in the middle of a "template war". You're right @teleprint-me, there is a temptation, but it's better to avoid it, at least in...

> Might just be me, but I slightly prefer the aesthetic / concise legibility of seeing the entire message in context

@kaizau I personally prefer having the prefix/postfix explicit, since it...
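To make the prefix/postfix style concrete, here is a tiny sketch (tokens are ChatML-style and purely illustrative): each message is wrapped by an explicit prefix and postfix instead of being rendered from one monolithic template.

```python
# Hypothetical explicit prefix/postfix for a ChatML-like format
PREFIX = "<|im_start|>{role}\n"
POSTFIX = "<|im_end|>\n"

def format_msg(role, content):
    # Wrap a single message with its prefix and postfix
    return PREFIX.format(role=role) + content + POSTFIX

assert format_msg("user", "Hi") == "<|im_start|>user\nHi<|im_end|>\n"
```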

@sorasoras Yeah I think I'll try that next. For the moment, I haven't been able to test this PR yet. Also, I planned to start by simply processing layer-by-layer, that way I don't...
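The layer-by-layer approach mentioned above can be sketched as follows (all names are hypothetical, not llama.cpp's actual API): loading, transforming, and saving one layer at a time keeps peak memory bounded by a single layer instead of the whole model.

```python
def process_model(layer_names, load_layer, transform, save_layer):
    # Only one layer is resident in memory at any point
    for name in layer_names:
        tensor = load_layer(name)
        save_layer(name, transform(tensor))

# Toy in-memory "model" standing in for tensors on disk
store = {"blk.0": [1.0, 2.0], "blk.1": [3.0]}
out = {}
process_model(
    list(store),
    load_layer=store.pop,  # release the source layer once it is loaded
    transform=lambda t: [x * 2 for x in t],
    save_layer=out.__setitem__,
)
assert out == {"blk.0": [2.0, 4.0], "blk.1": [6.0]}
assert store == {}  # every source layer was released after processing
```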