Ziqing Xing

2 issues by Ziqing Xing

When testing with the TransformerEncoderLayer, the computed FLOPs scale strictly linearly with the sequence length \( L \), rather than following the theoretical \( L^2 \) relationship expected from...
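For reference, here is a minimal sketch of the theoretical count for one encoder layer, split into the part that grows linearly with \( L \) and the part that grows with \( L^2 \). It rests on assumptions that are not part of the issue: batch size 1, 2 FLOPs per multiply-accumulate, default sizes `d_model=512` and `dim_feedforward=2048`, and softmax, LayerNorm, bias and dropout costs ignored. The function name `encoder_layer_flops` is hypothetical and is not an API of any FLOP-counting library.

```python
def encoder_layer_flops(seq_len, d_model=512, dim_feedforward=2048):
    """Rough theoretical FLOPs for one Transformer encoder layer.

    Assumptions (illustrative only): batch size 1, 2 FLOPs per
    multiply-accumulate, and no softmax/LayerNorm/bias/dropout cost.
    """
    # Q, K, V and output projections: four (L x d) @ (d x d) matmuls.
    projection_macs = 4 * seq_len * d_model * d_model
    # Feed-forward block: (L x d) @ (d x d_ff), then (L x d_ff) @ (d_ff x d).
    feedforward_macs = 2 * seq_len * d_model * dim_feedforward
    # Attention scores Q @ K^T and the weighted sum A @ V.
    # Summed over all heads, each costs L * L * d multiply-accumulates.
    attention_macs = 2 * seq_len * seq_len * d_model

    linear = 2 * (projection_macs + feedforward_macs)  # grows as L
    quadratic = 2 * attention_macs                     # grows as L^2
    return {"linear": linear, "quadratic": quadratic, "total": linear + quadratic}


if __name__ == "__main__":
    for seq_len in (128, 256, 512, 1024):
        flops = encoder_layer_flops(seq_len)
        share = flops["quadratic"] / flops["total"]
        print(f"L={seq_len:5d}  total={flops['total']:.3e}  quadratic share={share:.1%}")
```

Under these assumptions the two parts are equal at \( L = 2 \cdot d_{model} + d_{ff} \), so for short sequences the \( L^2 \) term is only a small share of the total. It is never zero, though, so a count that is strictly linear in \( L \) would be consistent with the two attention matrix products not being counted at all.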