DupMAE for ModernBERT

Open BlessedTatonka opened this issue 1 year ago • 0 comments

Hello! Are there any plans for a RetroMAE/DupMAE implementation for ModernBERT pre-training? I was able to change a couple of arguments to start training ModernBERT-base, but the grad_norm and loss values are stuck at 0/nan, so it seems harder to implement than that. Any advice would be appreciated.

BlessedTatonka, Jan 11 '25 09:01