Yutao ZHU

2 issues by Yutao ZHU

In vanilla BERT, we should pass an attention mask to avoid attending to padding positions. However, no attention mask is used in this code repository. Is there any reason for...
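For readers unfamiliar with the point being raised, here is a minimal sketch of what an attention mask does to the attention weights. It is plain Python and is not taken from the repository in question. The function name `masked_softmax` and the example scores are made up for illustration, and it assumes at least one position is unmasked.

```python
import math

def masked_softmax(scores, attention_mask):
    """Softmax over `scores`, giving zero weight where attention_mask is 0.

    Hypothetical helper for illustration; assumes at least one mask entry is 1.
    """
    # Replace scores at padding positions with -inf so exp() maps them to 0.
    masked = [s if m == 1 else float("-inf") for s, m in zip(scores, attention_mask)]
    # Subtract the maximum for numerical stability before exponentiating.
    peak = max(masked)
    exps = [math.exp(s - peak) for s in masked]
    total = sum(exps)
    return [e / total for e in exps]

# Four positions: two real tokens followed by two padding tokens.
scores = [2.0, 1.0, 0.5, 0.5]
attention_mask = [1, 1, 0, 0]
weights = masked_softmax(scores, attention_mask)
print(weights)
```

Without the mask, the two padding positions would receive a nonzero share of the attention weight, which is the behavior the question is concerned about.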

Thanks for sharing such awesome work! I would be grateful if you could also share the fine-tuned checkpoint. Regards, Z.