
loss.detach().clone().mean() * (microbatch_size / current_batch_size)

YixinSong-e opened this issue · 4 comments

When I set `moe_loss_weight: 0`, training fails with the following traceback:

[rank7]:   File "/home/syx/miniconda3/envs/lmf/lib/python3.11/site-packages/composer/trainer/trainer.py", line 2907, in <lambda>
[rank7]:     **kwargs: self._train_microbatches(microbatches, loss_dict, **kwargs).item(),
[rank7]:               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank7]:   File "/home/syx/miniconda3/envs/lmf/lib/python3.11/site-packages/composer/trainer/trainer.py", line 3075, in _train_microbatches
[rank7]:     microbatch_loss_dict = self._train_microbatch(use_grad_scaling, current_batch_size, is_final_microbatch)
[rank7]:                            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank7]:   File "/home/syx/miniconda3/envs/lmf/lib/python3.11/site-packages/composer/trainer/trainer.py", line 3209, in _train_microbatch
[rank7]:     microbatch_loss_dict[k] = loss.detach().clone().mean() * (microbatch_size / current_batch_size)
[rank7]:                               ^^^^^^^^^^^
[rank7]: AttributeError: 'float' object has no attribute 'detach'
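A plausible reading of the traceback: with `moe_loss_weight` set to 0, one entry in the loss dict ends up as a plain Python float (e.g. `0.0`) rather than a `torch.Tensor`, so Composer's per-microbatch bookkeeping fails on `.detach()`. This minimal sketch reproduces the failure and shows one possible guard (the `moe_loss_weight` root cause and the `torch.as_tensor` workaround are assumptions, not a confirmed fix):

```python
import torch

# Assumption: with moe_loss_weight == 0, the auxiliary loss is a plain
# Python float instead of a torch.Tensor.
loss = 0.0

# A float has no .detach(), which is exactly the AttributeError above.
try:
    loss.detach()
except AttributeError as e:
    print(e)  # 'float' object has no attribute 'detach'

# One possible guard: wrap scalar losses in a tensor before the
# trainer's bookkeeping touches them.
loss = torch.as_tensor(loss)
microbatch_size, current_batch_size = 4, 16
scaled = loss.detach().clone().mean() * (microbatch_size / current_batch_size)
print(scaled.item())  # 0.0
```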

YixinSong-e · Oct 17, 2024