Judd
By the way, which mic is numbered as the 0th, 1st, ... 7th channel?
I think the output embedding is associated with the current prediction of the next token. ```c memcpy(embedding_out.data(), (float *) ggml_get_data(embeddings) + (n_embd*(N - 1)), sizeof(float)*n_embd); ``` https://github.com/ggerganov/llama.cpp/blob/fa84c4b3e80199a5683438f062009c031a06c4fa/llama.cpp#LL1655C6-L1655C6
FYI: PR #24.
@iHaagcom are you testing the latest code with PR #24 merged?
My point is that, with the above code, it can be proved *successfully*. But Fig. 1 states that it **can't** be proved without the auxiliary construction provided...
A detailed analysis: https://foldl.github.io/2022-10-18-chasing-a-win32-bug/
These models support function calling (without fine-tuning): * ChatGLM3/GLM-4 * Mistral v0.3 * Qwen v1.5 & v2 For Qwen, function calling can be implemented outside the inference application. I have...
Hey, you can give [ChatLLM.cpp](https://github.com/foldl/chatllm.cpp) a try; the 2B, 1B, and MoE models are all supported. 😊
https://github.com/ggerganov/llama.cpp/issues/5356 is closed for being inactive. So, any updates?
Oh, my fault. I meant `ggml_new_tensor_2d(..., 2, 1)`; then `ggml_n_dims() == 1`.