
Scaling Data-Constrained Language Models

5 datablations issues

Hi, I am reading your paper and I have noticed that figure 4 and figure 15 are exactly the same. Are they meant to be the same? I believe that...

"Scaling Data-Constrained Language Models" is a very nice paper, and I learned a lot from it. However, I have a question: in the abstract and Figure...

This should make it easier for us to investigate scaling laws @TevenLeScao

Hi authors, thanks for the great work! I wonder whether LR=1e-3 for muP is the optimal value from a small-scale proxy model, and how critical dropout is for multi-epoch training. For...

Hi @Muennighoff, great paper, very impressive and very detailed work. Thanks for releasing the data! I wonder about a small discrepancy that I see between your work and scaling...