Boris Malashenko
Thank you for the great work! I have a question regarding pre-training. Could you please clarify which YAML configuration file should be used to achieve a similar pre-training setup as...
Hello! Are there any plans for a Retro/Dupmae implementation for ModernBERT pre-training? I was able to change a couple of arguments to start training for ModernBERT-base; however, grad_norm and loss values are...
If anyone is struggling with installing old dependencies (as I was 😊), here are the Dockerfile contents to ensure a successful start: ``` # Use the official Python 3.7 image...