icefall
Exception Ignored in BalancerFunction of scaling.py
I am curious about the reasoning behind doing nothing when an exception is raised in BalancerFunction.
https://github.com/k2-fsa/icefall/blob/23913f6afdea59caf703e3ac715852810cd246ad/egs/librispeech/ASR/zipformer/scaling.py#L798-L801
In my case, a CUDA OOM exception raised inside this block of code was ignored on one of my GPU nodes, and training continued without problems. Is the new BalancerFunction supposed to work this way?
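
To make the question concrete, here is a minimal sketch of the pattern I am asking about. It is plain Python rather than the actual icefall code: `backward_with_optional_correction`, `failing_correction`, and `small_correction` are hypothetical names, the gradient is a plain list of floats instead of a tensor, and the OOM is simulated with a `RuntimeError`. The idea is that the extra gradient term is treated as optional, so a failure while computing it is logged and the incoming gradient is returned unchanged.

```python
import logging
from typing import Callable, List


def backward_with_optional_correction(
    grad: List[float],
    correction_fn: Callable[[List[float]], List[float]],
) -> List[float]:
    """Hypothetical stand-in for a backward pass that adds an auxiliary
    correction to the incoming gradient but treats it as optional."""
    try:
        correction = correction_fn(grad)
        grad = [g + c for g, c in zip(grad, correction)]
    except Exception as e:
        # Log and fall through: the caller still receives the
        # unmodified incoming gradient, so training can continue.
        logging.info("Caught exception in backward: %s, will continue.", e)
    return grad


def failing_correction(grad: List[float]) -> List[float]:
    # Simulates an out-of-memory failure in the extra computation.
    raise RuntimeError("CUDA out of memory (simulated)")


def small_correction(grad: List[float]) -> List[float]:
    # A correction that succeeds and adds a constant to each element.
    return [0.5 for _ in grad]
```

If this is the intended behavior, then an OOM inside the block only means that the extra term is skipped for that step on that node, which would explain why training continued.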