BERT-pytorch
chooses 15% of tokens
The paper says:
Instead, the training data generator chooses 15% of tokens at random, e.g., in the sentence my dog is hairy it chooses hairy.
This means that exactly 15% of tokens should be chosen.
However, in https://github.com/codertimo/BERT-pytorch/blob/master/bert_pytorch/dataset/dataset.py#L68, every single token independently has a 15% chance of going through the follow-up masking procedure. Does that align with "15% of tokens will be chosen"?
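To illustrate the distinction being raised, here is a minimal sketch (not the repository's actual code) contrasting the two sampling strategies: per-token Bernoulli sampling, where each token is independently masked with probability 0.15 and the masked count varies, versus choosing exactly 15% of positions, a literal reading of the paper:

```python
import random

tokens = ["my", "dog", "is", "hairy"] * 25  # 100 tokens for easy percentages
mask_rate = 0.15

# Strategy 1: per-token Bernoulli sampling (what dataset.py#L68 does).
# Each token is masked independently, so the masked count only averages 15.
bernoulli_masked = [i for i, _ in enumerate(tokens) if random.random() < mask_rate]

# Strategy 2: fixed-fraction sampling. Exactly 15% of positions are chosen.
k = int(len(tokens) * mask_rate)
fixed_masked = random.sample(range(len(tokens)), k)

print(len(bernoulli_masked))  # varies: binomially distributed around 15
print(len(fixed_masked))      # always 15
```

Both strategies mask 15% of tokens in expectation, so in practice the difference is mainly variance in the per-sentence masked count; the discrepancy with the paper's wording is what this issue points out.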
Sorry for the late response. I think you are right; I'll fix it ASAP.