
RuntimeError: Too many grads were not finite

Open SSwethaSel0609 opened this issue 1 year ago • 0 comments

I'm trying to fine-tune the model using zipformer, and I'm running into this error:

Traceback (most recent call last):
  File "finetune.py", line 1532, in <module>
    main()
  File "finetune.py", line 1525, in main
    run(rank=0, world_size=1, args=args)
  File "finetune.py", line 1403, in run
    train_one_epoch(
  File "finetune.py", line 1076, in train_one_epoch
    scaler.step(optimizer)
  File "/mnt/efs/swetha/ms_exp/icefall_env/lib/python3.8/site-packages/torch/cuda/amp/grad_scaler.py", line 313, in step
    return optimizer.step(*args, **kwargs)
  File "/mnt/efs/swetha/ms_exp/icefall_env/lib/python3.8/site-packages/torch/optim/optimizer.py", line 140, in wrapper
    out = func(*args, **kwargs)
  File "/mnt/efs/swetha/ms_exp/icefall_env/lib/python3.8/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context
    return func(*args, **kwargs)
  File "/mnt/efs/swetha/ms_exp/icefall/egs/librispeech/ASR/zipformer/optim.py", line 345, in step
    clipping_scale = self._get_clipping_scale(group, batches)
  File "/mnt/efs/swetha/ms_exp/icefall/egs/librispeech/ASR/zipformer/optim.py", line 473, in _get_clipping_scale
    raise RuntimeError("Too many grads were not finite")
RuntimeError: Too many grads were not finite

log_error.txt

SSwethaSel0609 · Nov 22 '24 08:11