Melmaphother

Results 3 comments of Melmaphother

Thanks for your answer, I'll try it again in Jupyter Notebook.

Really? Why would renaming cause an error? I reproduced your code and got no error. Please check whether something else is causing the problem.

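> In the code below, I think it should be made clear why the Q, K, and V vector sequences are set equal to inputs_embeds. My understanding is that in the attention mechanism, Q, K, and V are obtained by multiplying the embedding by three matrices, W_Q, W_K, and W_V, and these three matrices are also hyperparameters, whereas the code below seems to assume by default that all three are identity matrices. `import torch from math import sqrt >...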
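
To make the point in the quoted comment concrete, here is a minimal sketch of scaled dot-product attention with explicit projection matrices. It is not the code from the original thread (that snippet is truncated above). It uses plain Python lists instead of torch so it runs without dependencies, and the helper names (`matmul`, `attention`, `identity`) and the example values for `w_q`, `w_k`, `w_v` are made up for illustration. Passing identity matrices reproduces the shortcut where Q = K = V = `inputs_embeds`; passing other matrices gives a different result.

```python
from math import exp, sqrt


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(m):
    return [list(col) for col in zip(*m)]


def softmax(row):
    """Numerically stable softmax over one row of scores."""
    m = max(row)
    exps = [exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]


def attention(x, w_q, w_k, w_v):
    """Scaled dot-product self-attention: project x to Q, K, V, then attend."""
    q, k, v = matmul(x, w_q), matmul(x, w_k), matmul(x, w_v)
    d_k = len(k[0])
    scores = [[s / sqrt(d_k) for s in row] for row in matmul(q, transpose(k))]
    weights = [softmax(row) for row in scores]  # each row sums to 1
    return matmul(weights, v)


# Toy embeddings: 3 tokens, embedding dimension 2 (illustrative values only).
inputs_embeds = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]

# Shortcut: identity projections, so Q = K = V = inputs_embeds.
eye = identity(2)
shortcut = attention(inputs_embeds, eye, eye, eye)

# Explicit projections: arbitrary example matrices standing in for W_Q, W_K, W_V.
w_q = [[0.5, -1.0], [2.0, 0.3]]
w_k = [[1.5, 0.2], [-0.7, 1.0]]
w_v = [[1.0, 2.0], [0.0, 1.0]]
projected = attention(inputs_embeds, w_q, w_k, w_v)

print(shortcut)
print(projected)
```

In a real model the three matrices would normally be trained along with the rest of the network, for example as `torch.nn.Linear` layers, rather than fixed by hand as they are here.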