DeepSpeedExamples
deepspeed-chat: print mean stage1/2 loss periodically
Print the mean loss periodically, based on the DeepSpeed 'steps_per_print' configuration, so that the mean loss is always printed on an optimizer step boundary. To reduce log clutter, only the rank 0 loss is printed.
This commit modifies the current print_loss functionality of stage1:
- Print mean loss at optimizer step boundary instead of at every micro-step
- Print periodically based on ds_config['steps_per_print']
- Print only at global rank 0
The commit adds print_loss functionality for stage2.
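
The behavior described above can be illustrated with a minimal sketch. It is not the code from this commit: `MeanLossPrinter`, its parameters, and the choice to average over the micro-step losses seen since the last print are all assumptions made for illustration. In a real run, `steps_per_print` and the gradient accumulation count would come from the DeepSpeed config, and the rank from the distributed runtime.

```python
class MeanLossPrinter:
    """Hypothetical helper (not part of DeepSpeed or DeepSpeed-Chat).

    Accumulates per-micro-step losses and reports their mean once every
    `steps_per_print` optimizer steps, printing only on global rank 0.
    """

    def __init__(self, steps_per_print, grad_accum_steps, global_rank=0, emit=print):
        self.steps_per_print = steps_per_print
        self.grad_accum_steps = grad_accum_steps
        self.global_rank = global_rank
        self.emit = emit          # injectable so the output can be captured
        self.micro_steps = 0      # micro-steps seen so far
        self.loss_sum = 0.0       # sum of losses since the last print
        self.loss_count = 0       # number of losses since the last print

    def update(self, loss):
        """Record one micro-step loss; return the mean when a report is due."""
        self.loss_sum += float(loss)
        self.loss_count += 1
        self.micro_steps += 1

        # Only act on an optimizer step boundary, not on every micro-step.
        if self.micro_steps % self.grad_accum_steps != 0:
            return None

        optimizer_step = self.micro_steps // self.grad_accum_steps
        if optimizer_step % self.steps_per_print != 0:
            return None

        mean_loss = self.loss_sum / self.loss_count
        self.loss_sum, self.loss_count = 0.0, 0

        # Only rank 0 prints, to reduce log clutter.
        if self.global_rank == 0:
            self.emit(f"step {optimizer_step}: mean loss = {mean_loss:.4f}")
        return mean_loss


if __name__ == "__main__":
    printer = MeanLossPrinter(steps_per_print=2, grad_accum_steps=2)
    for micro_loss in [1.0, 2.0, 3.0, 4.0]:
        printer.update(micro_loss)
```

In a training loop, `update` would be called once per micro-step with the scalar loss (for example `loss.item()` in PyTorch). Whether the actual commit also averages the loss across ranks before printing is not stated here, so this sketch reports the local mean only.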
Change-Id: I430d88cbbbbb2dd2fe7784dbadac69e522d5a192
@mosheisland, apologies for the delay in merging this PR. Can you please help resolve the conflicts? Thanks!
Hi @mosheisland - could you review the merge conflicts and we can get this merged?