Ziqing Xing
Results: 2 issues of Ziqing Xing
When testing with the TransformerEncoderLayer, the computed FLOPs scale strictly linearly with the sequence length \( L \), rather than following the theoretical \( L^2 \) relationship expected from...
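As a rough cross-check of the expected scaling, the following is a minimal analytical sketch, not the counter used in the issue. It assumes a standard encoder layer (multi-head self-attention plus a two-layer feed-forward block), batch size 1, and one multiply-add counted as 2 FLOPs; softmax, LayerNorm, biases, and activations are ignored. The function name `encoder_layer_flops` and the sizes `d_model=512`, `d_ff=2048` are illustrative assumptions.

```python
def encoder_layer_flops(seq_len, d_model, d_ff):
    """Approximate forward FLOPs of one encoder layer (batch size 1).

    Hypothetical helper for illustration: counts matmuls only,
    with one multiply-add counted as 2 FLOPs.
    """
    # Q, K, V and output projections: four (L x d) @ (d x d) matmuls.
    proj = 4 * 2 * seq_len * d_model * d_model
    # Attention scores (Q K^T) and the weighted sum with V: summed over
    # all heads, each is an L x L x d matmul, hence the L^2 dependence.
    attn = 2 * 2 * seq_len * seq_len * d_model
    # Feed-forward block: (L x d) @ (d x d_ff), then (L x d_ff) @ (d_ff x d).
    ffn = 2 * 2 * seq_len * d_model * d_ff
    return {"linear_in_L": proj + ffn, "quadratic_in_L": attn}


if __name__ == "__main__":
    # Assumed sizes, chosen only to show how the two terms grow with L.
    for seq_len in (128, 512, 2048, 8192):
        counts = encoder_layer_flops(seq_len, d_model=512, d_ff=2048)
        total = counts["linear_in_L"] + counts["quadratic_in_L"]
        share = counts["quadratic_in_L"] / total
        print(f"L={seq_len:5d}  total={total:.3e}  quadratic share={share:.2%}")
```

Under these assumptions the quadratic term equals the linear term at \( L = 2 d_{\text{model}} + d_{\text{ff}} \), so for short sequences the total can look close to linear even when the attention matmuls are counted. A measured count that is exactly linear in \( L \) would instead suggest that the \( L^2 \) attention term is missing from the count altogether.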