Gary Mulder

154 comments by Gary Mulder

Probably a duplicate of #40?

1. Are you talking about eval set loss or training loss?
2. Plot both as a function of epoch, similar to #63, to see whether you are overfitting or underfitting...
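For anyone who has the per-epoch numbers but no plot yet, here is a minimal sketch of the comparison in Python. The function name `diagnose`, the `tol` threshold, and the sample loss values are all hypothetical, not taken from any issue, and the rule is only a rough heuristic: eval loss bottoming out and then rising while training loss keeps falling suggests overfitting, and eval loss still falling at the last epoch suggests underfitting.

```python
from typing import List


def diagnose(train_loss: List[float], eval_loss: List[float], tol: float = 1e-3) -> str:
    """Rough heuristic over per-epoch losses (hypothetical helper).

    Returns "overfitting", "underfitting", or "inconclusive".
    """
    if not train_loss or len(train_loss) != len(eval_loss):
        raise ValueError("need one train and one eval loss value per epoch")

    last = len(eval_loss) - 1
    # Epoch (0-based) at which the eval loss was lowest.
    best_epoch = min(range(len(eval_loss)), key=eval_loss.__getitem__)

    # Eval loss has risen since its minimum while train loss kept falling.
    eval_rose = eval_loss[last] - eval_loss[best_epoch] > tol
    train_fell = train_loss[best_epoch] - train_loss[last] > tol
    if best_epoch < last and eval_rose and train_fell:
        return "overfitting"

    # Eval loss was still improving at the final epoch.
    if len(eval_loss) >= 2 and eval_loss[-2] - eval_loss[-1] > tol:
        return "underfitting"

    return "inconclusive"


# Made-up example curves, for illustration only.
train = [2.0, 1.5, 1.1, 0.8, 0.5]
evals = [2.1, 1.7, 1.6, 1.8, 2.0]
print(diagnose(train, evals))
```

The same two lists can be handed to any plotting library to produce the train/eval curves asked for above. A single epoch of data always comes back as "inconclusive", since there is no trend to compare.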

Without a plot it is difficult to say for certain, but you are probably overfitting. Don't train for more than one epoch.

@myeolinmalchi do you have a torrent for the 30B or 65B weights?