Xuan Son Nguyen

Results 73 comments of Xuan Son Nguyen

Before moving further, @ggerganov could you please take a look on the API design to see if that's OK for you? Thanks.

@teleprint-me @ggerganov Thanks for your feedback. I understand that this part can get complicated easily in the future, so these things are being considered when I made this proposal: 1....

@ggerganov @phymbert I got a weird issue on the CI workflow where the `master` branch get merged automatically to the code on CI. Do you have some clue about that?...

I'm changing this PR to "demo" since I'm still not very confident to make the chat template system become more complicated. Maybe we will re-visit this in the future. This...

@hnfong The reason why I propose to reuse `llama_chat_apply_template` is because I don't want to have 2 different APIs to manage chat templates. If we want to have prefix/postfix system,...

My idea is simply call `llama_chat_apply_template` twice: with and without the last user message. Then, I can find the diff between 2 output strings and feed it into inference. This...

#6795 should work with all templates **except** for templates that does not have support for system prompt (or llama2 with `` for system message). That's why in #6810 I propose...

> chat template logic gets called only once per message Take this case for example: We have 2 messages: - system: You are a helpful assistant - user: Hi, who...

> it is better to update chat-apply-template to add system role wrapping, rather than trying to look through messages multiple times Sorry I don't really get this idea, can you...