Ruihan Xu

Results: 1 comment by Ruihan Xu

Because LayerNorm is applied over the last dimension (so it needs 1D tokens), while Conv is applied over the last two dimensions (so it needs 2D tokens).
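To make the shape requirement concrete, here is a minimal sketch, assuming PyTorch's `nn.LayerNorm` and `nn.Conv2d`. The sizes `B`, `H`, `W`, `C` and all variable names are hypothetical and chosen only for illustration; the comment above does not say which model or layer configuration is being discussed.

```python
import torch
import torch.nn as nn

# Hypothetical sizes: batch, height, width, channels.
B, H, W, C = 2, 4, 4, 8

norm = nn.LayerNorm(C)                             # normalizes over the last dim (C)
conv = nn.Conv2d(C, C, kernel_size=3, padding=1)   # slides over the last two dims (H, W)

# 1D token layout: (B, N, C) with N = H * W, channels last.
tokens = torch.randn(B, H * W, C)
normed = norm(tokens)

# Rearrange into a 2D token map (B, C, H, W) so the convolution sees H and W.
fmap = normed.transpose(1, 2).reshape(B, C, H, W)
out_map = conv(fmap)

# Flatten back to the 1D token layout (B, N, C) for the next LayerNorm.
out_tokens = out_map.flatten(2).transpose(1, 2)

print(normed.shape, fmap.shape, out_tokens.shape)
```

The two reshapes are pure layout changes; they only move the channel axis and split or merge the spatial axes so that each layer sees the dimension order it expects.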