join20252

Results: 1 issue by join20252

Starting training..., iters: 1000
Iter 1: Val loss 9.100, Val took 5657.756s
Iter 10: Train loss 9.754, Learning Rate 1.000e-05, It/sec 0.005, Tokens/sec 6.189, Trained Tokens 12868, Peak mem 81.864...