jmcmanus15

Results: 1 issue by jmcmanus15

I think there may be a subtle bug in `disentangled_attention_bias`. The HuggingFace implementation of this function is a more direct reproduction of Eq. (4) from the disentangled attention paper. The...