Gregory Polyakov
@rodrigonogueira4 Thanks for your response. I tried the hyperparams you suggested: --train_batch_size=4 --accumulate_grad_batches=32 --optimizer=AdamW --lr=3e-5 --weight_decay=5e-5. So far, the closest result was obtained by training mono-t5 for 9k steps...
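
For anyone comparing runs, here is a rough sketch of what those flags imply for the effective batch size. It assumes a single device and that "9k steps" counts optimizer updates rather than forward passes; neither is stated in the comment, so adjust `num_devices` and the step interpretation to match your setup.

```python
# Hyperparameters as quoted in the comment above.
train_batch_size = 4
accumulate_grad_batches = 32

# Assumption: one device, so there is no extra data-parallel factor.
num_devices = 1

# Gradients from 32 micro-batches of 4 are accumulated before each update.
effective_batch_size = train_batch_size * accumulate_grad_batches * num_devices
print(effective_batch_size)  # 128

# Assumption: "9k steps" means 9,000 optimizer updates.
optimizer_steps = 9_000
examples_seen = optimizer_steps * effective_batch_size
print(examples_seen)  # 1152000
```

If the step counter instead counts micro-batches, the number of examples seen would be 32 times smaller, which may matter when comparing against a reference run.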