Pavel Voropaev

Results: 5 issues by Pavel Voropaev

Signed-off-by: voropz

Signed-off-by: Pavel Voropaev
Added the ability to use `MT_Eltwise` in the `TransformerEncoderLayer`.

Signed-off-by: Pavel Voropaev
The behaviour of a single attention head doesn't depend on the total number of heads, just as in the original paper.

Signed-off-by: Pavel Voropaev
I think this can fix https://github.com/neoml-lib/neoml/issues/757...

https://github.com/neoml-lib/neoml/blob/bbae2779fbce3d757d4a665063a35d8d303b9fef/NeoML/src/Dnn/Layers/PositionalEmbeddingLayer.cpp#L64
In `PET_LearnableAddition` mode this condition is false whenever the sequence length differs from the previous one. `initializeLearnableAddition()` is then called, completely resetting the weights without...
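The failure mode described in this report can be illustrated with a small self-contained sketch. This is not NeoML code: `ToyPositionalAddition`, `reshape` and `resetOnLengthChange` are hypothetical names invented for the illustration, and the real layer stores its parameters differently. The sketch only contrasts reinitialising a learnable table on every length change with growing it while keeping the values already learned, which is one possible remedy and not necessarily the fix the author has in mind.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a layer with a learnable per-position table.
// None of these names come from NeoML; they are assumptions for illustration.
struct ToyPositionalAddition {
    std::vector<float> weights;  // toy model: one learnable value per position
    bool resetOnLengthChange;    // true reproduces the reported behaviour

    explicit ToyPositionalAddition(bool reset) : resetOnLengthChange(reset) {}

    // Called before a forward pass with the current sequence length.
    void reshape(std::size_t seqLen) {
        if (resetOnLengthChange) {
            // Reported behaviour: any length mismatch reinitialises the table,
            // so values learned so far are discarded.
            if (weights.size() != seqLen) {
                weights.assign(seqLen, 0.0f);
            }
        } else {
            // Alternative: only grow the table, keeping values already learned.
            // Shorter sequences simply use a prefix of the table.
            if (seqLen > weights.size()) {
                weights.resize(seqLen, 0.0f);
            }
        }
    }
};
```

With `resetOnLengthChange` set to `true`, a value written at position 0 for a length-4 sequence is lost as soon as `reshape(6)` is called. With it set to `false`, the same value survives both a longer and a shorter sequence.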