Henry Liu comments

Results: 2 comments of Henry Liu

[BUG] Example of pretraining BERT does not work

In Megatron-DeepSpeed/megatron/model/bert_model.py, there is a line:

```python
extended_attention_mask = bert_extended_attention_mask(attention_mask)
```

where `bert_extended_attention_mask` is defined as:

```python
def bert_extended_attention_mask(attention_mask):
    # We create a 3D attention mask from a 2D tensor mask....
```
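The excerpt above is cut off, so the actual body of `bert_extended_attention_mask` is not visible here. Purely as an illustration of what "creating a 3D attention mask from a 2D tensor mask" can mean, here is a minimal sketch in plain Python. The helper name `make_3d_mask_sketch` is hypothetical, and the outer-product construction is an assumption about the general idea, not the Megatron-DeepSpeed code.

```python
def make_3d_mask_sketch(mask_2d):
    """Hypothetical helper, not the Megatron-DeepSpeed function.

    Expands a [batch, seq] padding mask of 0/1 values into a
    [batch, seq, seq] mask. Entry [b][i][j] is 1 only when both
    token i and token j of sample b are real (non-padding) tokens,
    i.e. each row is combined with itself as an outer product.
    """
    return [[[qi * kj for kj in row] for qi in row] for row in mask_2d]


# One sequence of length 3 whose last position is padding.
mask_2d = [[1, 1, 0]]
mask_3d = make_3d_mask_sketch(mask_2d)
print(mask_3d)  # [[[1, 1, 0], [1, 1, 0], [0, 0, 0]]]
```

With tensors, the same idea would presumably be expressed with broadcasting rather than nested lists; the sketch only shows the shape change from `[batch, seq]` to `[batch, seq, seq]`.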

[BUG] Example of pretraining BERT does not work

> In Megatron-DeepSpeed/megatron/model/bert_model.py, there is a line:
>
> ```python
> extended_attention_mask = bert_extended_attention_mask(attention_mask)
> ```
>
> where `bert_extended_attention_mask` is defined as:
>
> ```python
> def bert_extended_attention_mask(attention_mask):
>     #...
> ```