Probable error on line 306 in `create_pretraining_data.py` for albert
https://github.com/google-research/albert/blob/932b41f0319fbef7efd069d5ff545e3358574e19/create_pretraining_data.py#L306
There appears to be an issue on line 306.
`random.randint(start, end)` is end-inclusive: it can return `end` itself.
So when `len(current_chunk) == 2`, the loop on line 309 stops after a single iteration.
Although this may let the model incorporate the single leftover chunk (if execution enters the first `elif` branch on line 339), it will otherwise leave that chunk out of the training instances.
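
The end-inclusive behavior can be checked in isolation with a minimal sketch. It assumes line 306 follows the pattern `rng.randint(1, len(current_chunk) - 1)`; the two-segment `current_chunk` and the helper name `pick_a_end` are hypothetical stand-ins, not code from the repository.

```python
import random


def pick_a_end(current_chunk, rng):
    """Hypothetical helper mirroring the assumed pattern on line 306."""
    a_end = 1
    if len(current_chunk) >= 2:
        # random.randint(a, b) includes both endpoints.
        a_end = rng.randint(1, len(current_chunk) - 1)
    return a_end


# Hypothetical chunk holding exactly two segments.
current_chunk = [["first", "segment"], ["second", "segment"]]
rng = random.Random(12345)

# With two segments the call is randint(1, 1), so only 1 can be drawn.
a_end_values = {pick_a_end(current_chunk, rng) for _ in range(1000)}

# The loop that fills tokens_a therefore runs exactly once.
tokens_a = []
for j in range(min(a_end_values)):
    tokens_a.extend(current_chunk[j])

print(a_end_values, tokens_a)
```

Under this assumption, `a_end` is always 1 for a two-segment chunk, so only the first segment ever reaches `tokens_a`.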
Please address this issue.