🐛[BUG]: accumulated loss should be divided by accumulation steps to get the mean loss for wandb report

Open yairchn opened this issue 1 year ago • 0 comments

Version

0.5.0

On which installation method(s) does this occur?

Pip

Describe the issue

In the training loop, the accumulated loss is computed additively across all steps in num_accumulation_rounds. Each step's loss should be divided by num_accumulation_rounds so that the mean of the loss, rather than the sum, is recorded to wandb: loss_accum += loss/num_accumulation_rounds
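To illustrate the difference, here is a minimal sketch in plain Python, not the library's actual training loop. The function name accumulate_loss and the sample loss values are hypothetical; only loss_accum and num_accumulation_rounds come from the code in question.

```python
def accumulate_loss(losses, num_accumulation_rounds):
    """Compare the summed loss with the mean loss over one accumulation window.

    `losses` holds one scalar loss per accumulation round (hypothetical values).
    """
    if len(losses) != num_accumulation_rounds:
        raise ValueError("expected one loss per accumulation round")

    loss_sum = 0.0    # what is effectively reported today: the sum over rounds
    loss_accum = 0.0  # proposed: the mean over rounds

    for loss in losses:
        loss_sum += loss
        # Proposed fix: scale each contribution so the total is the mean.
        loss_accum += loss / num_accumulation_rounds

    return loss_sum, loss_accum


# Hypothetical per-round losses for a window of 4 accumulation rounds.
micro_batch_losses = [0.5, 1.0, 1.5, 1.0]
loss_sum, loss_mean = accumulate_loss(micro_batch_losses, 4)
print(f"sum={loss_sum}, mean={loss_mean}")
```

With the summed version, the value logged to wandb grows in proportion to num_accumulation_rounds, so runs with different accumulation settings are not directly comparable. The mean stays on the same scale.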

Minimum reproducible example

No response

Relevant log output

No response

Environment details

No response

Apr 04 '24 16:04 yairchn