llama2.c
Questions about the matmul function in run.c
I'm new to transformers and have a question about the matrix multiplication in run.c. I hope the author can help.
In the forward function, matmul is called to compute q, k, and v. But matmul actually multiplies a matrix by a one-dimensional vector, so each of the resulting q, k, and v is also a 1×dim vector. In the transformer descriptions I've seen, Q, K, and V are two-dimensional matrices. Why are they vectors here? Thanks!
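To make the question concrete, here is a minimal sketch of how I read it. This is my own illustration, not code copied from run.c: `matvec` and `project_tokens` are hypothetical names, and the row-major (d, n) weight layout is an assumption. If the forward pass handles one token per call, then each call would produce one row, and stacking those rows over positions would give the seq_len × dim matrix I expected.

```c
#include <assert.h>
#include <stddef.h>

/* Matrix-vector product (hypothetical name): w is (d, n) row-major,
 * x has n entries, xout receives d entries. Layout is an assumption. */
void matvec(float *xout, const float *x, const float *w, int n, int d) {
    assert(n > 0 && d > 0);
    for (int i = 0; i < d; i++) {
        float acc = 0.0f;
        for (int j = 0; j < n; j++) {
            acc += w[(size_t)i * n + j] * x[j];
        }
        xout[i] = acc;
    }
}

/* Hypothetical helper: project seq_len token vectors one at a time.
 * xs and q are (seq_len, dim) row-major; row t of q is the query
 * vector for token t, so the rows together form the 2D matrix Q. */
void project_tokens(float *q, const float *xs, const float *wq,
                    int seq_len, int dim) {
    for (int t = 0; t < seq_len; t++) {
        matvec(q + (size_t)t * dim, xs + (size_t)t * dim, wq, dim, dim);
    }
}
```

In this sketch a single `matvec` call matches the 1×dim result described above, while `project_tokens` shows how the per-token vectors would line up as rows of a matrix.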