LX123123
Results
7 issues of LX123123
Is the output linear layer of the MultiHeadAttention class in mha.py configured incorrectly? Shouldn't in_features be heads*d_k?
improvement
Can you provide a requirements.txt and a demo file showing the data format? Thank you.
I see that this project is made up of many models. Are these models trained separately or together? How should we retrain them on our own dataset?