Nystromformer
Softmax score on Text4k; linformer-256 and nystrom-64 don't work
Hi,
Thanks for the excellent work!
I ran into some issues in my trials (I didn't change anything in the code):
- Using softmax attention on Text4k, I got ~63.7 accuracy instead of the 65.02 reported in your paper.
- With linear attention on Text4k, I got ~64 accuracy, which is even higher than the vanilla transformer. Did you get the same result on your side?
- The attention types linformer-256 and nystrom-64 don't work; the errors are either dimension mismatches or config key errors. It seems that not all attention types run successfully in the released code. By the way, I didn't try all of the choices.
Thank you for your time, I look forward to your reply~
Ziwei