Xuan Son Nguyen

Results 73 comments of Xuan Son Nguyen

- The columns are `model - scale - model - scale`, not `all models then all scales`; can you re-check it? - Maybe also remove the space in CSV,...

I added a debug message to test if the parser is correct:

```
Parsing configurations:
- Layer 0 = + model[0].layer[0]*1 + model[1].layer[0]*0
- Layer 1 = + model[0].layer[1]*0 +...
```

Nice, thanks for the info! It's true that I have a misalignment somewhere; I'll have a look tonight.

@dnhkng I rewrote the part where it actually does the calculation. As a side effect, you can now use quantized models as both input and output (yay, that's what you asked for). I...

I finally got it working. You can now use a quant as input and it will be requantized (imatrix is not supported; only q4 and up are supported).

You can base your code on the `simple.cpp` example, which extracts the logits and uses a greedy method to sort and sample the next token: https://github.com/ggerganov/llama.cpp/blob/a0e584defd8c16e7a51ab895f595df0448d710d0/examples/simple/simple.cpp#L128 To read out the list of tokens from...

I've done detailed research on the same subject, so I strongly recommend referring to this issue: https://github.com/ggerganov/llama.cpp/issues/6391 Also, a new function named `llama_token_is_eog` will be introduced with...

I need this too. Currently, the problem is that we cannot access metadata outside of `llama_model_loader` (please correct me if I'm wrong).

@slaren Perfect, thanks. That's exactly what I was missing in https://github.com/ggerganov/llama.cpp/pull/5425 I'm not sure how we can decode the template inside the cpp code. It would be far more complicated to...

> Would that work for weirder templates like MiniCPM's
>
> ```
> ```
>
> ?

No, not for now, but we can add support for these...