Comments by dancingacidpanda (1 result)
train_num_steps is the number of gradient updates the model performs before it stops training; one update happens for every train_batch_size * gradient_accumulate_every samples from your dataset. timesteps is "the one from the...
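To make the train_num_steps arithmetic concrete, here is a rough sketch of how many samples a run consumes and roughly how many passes over the dataset that is. The helper names, the dataset size, and the numbers are made up for illustration; only the three parameter names come from the comment above.

```python
def samples_consumed(train_num_steps, train_batch_size, gradient_accumulate_every):
    # Each update uses train_batch_size * gradient_accumulate_every samples.
    samples_per_update = train_batch_size * gradient_accumulate_every
    return train_num_steps * samples_per_update


def approx_epochs(train_num_steps, train_batch_size, gradient_accumulate_every, dataset_size):
    # dataset_size is a hypothetical value: the number of samples in your dataset.
    total = samples_consumed(train_num_steps, train_batch_size, gradient_accumulate_every)
    return total / dataset_size


# Illustrative numbers only, not recommended settings.
total = samples_consumed(train_num_steps=1000, train_batch_size=32, gradient_accumulate_every=2)
epochs = approx_epochs(1000, 32, 2, dataset_size=8000)
print(total, epochs)
```

So if you want a fixed number of passes over your data, you can solve the same relation for train_num_steps instead.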