albert
Problem with the shape description of attention_mask.
https://github.com/google-research/albert/blob/a41cf11700c1ed2b7beab0a2649817fa52c8d6e1/modeling.py#L838-L860
There may be a problem with the shape description of attention_mask in L857~L860: the documented shape should be [batch_size, from_seq_length].
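For context, a minimal NumPy sketch of the mask expansion, modeled on the `create_attention_mask_from_input_mask` helper found in the BERT/ALBERT codebases (the function name comes from upstream; this standalone reimplementation is an assumption, not the original code). It shows how a 2-D input mask of shape [batch_size, to_seq_length] is broadcast into the 3-D attention mask of shape [batch_size, from_seq_length, to_seq_length] that the attention layer consumes:

```python
import numpy as np

def create_attention_mask_from_input_mask(from_shape, to_mask):
    """Sketch of the upstream helper (NumPy approximation, not the original).

    from_shape: (batch_size, from_seq_length)
    to_mask:    int array of shape [batch_size, to_seq_length]
    Returns a float mask of shape [batch_size, from_seq_length, to_seq_length].
    """
    batch_size, from_seq_length = from_shape
    to_seq_length = to_mask.shape[1]
    # Reshape the 2-D mask to [batch_size, 1, to_seq_length].
    to_mask = to_mask.reshape(batch_size, 1, to_seq_length).astype(np.float32)
    # Broadcast over the from_seq_length axis via elementwise multiply.
    broadcast_ones = np.ones((batch_size, from_seq_length, 1), dtype=np.float32)
    return broadcast_ones * to_mask

# A 2-D padding mask: batch_size=2, seq_length=3 (1 = real token, 0 = padding).
mask_2d = np.array([[1, 1, 0],
                    [1, 0, 0]])
mask_3d = create_attention_mask_from_input_mask((2, 3), mask_2d)
print(mask_3d.shape)  # (2, 3, 3)
```

So the 2-D mask handed to the model is [batch_size, seq_length], while the attention layer itself sees the expanded [batch_size, from_seq_length, to_seq_length] tensor; the docstring in question appears to describe the former.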