torch.autocast and DeepSpeed don't seem to compose directly; running the example raises an error
Describe the bug
Training examples/aishell/whisper with DeepSpeed fails with:
```
python3.10/site-packages/deepspeed/runtime/torch_autocast.py", line 97, in validate_nested_autocast
    raise AssertionError(
AssertionError: torch.autocast is enabled outside DeepSpeed, but not in the DeepSpeed config. Please enable torch.autocast through the DeepSpeed config to ensure the correct communication dtype is used.
```
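The assertion text itself points at one possible fix: recent DeepSpeed releases integrate torch.autocast natively, switched on through the config rather than by wrapping the forward pass yourself. A hedged sketch of the section the message refers to (key names follow DeepSpeed's mixed-precision docs from memory; verify them against 0.17.5, and note the docs suggest using this mode instead of, not together with, the fp16/bf16 sections):

```json
{
  "torch_autocast": {
    "enabled": true,
    "dtype": "bfloat16"
  }
}
```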
Removing the `with autocast` block from the batch_forward function lets training start, but then a dtype mismatch appears:

```
ch/nn/modules/conv.py", line 370, in _conv_forward
    return F.conv1d(
RuntimeError: Input type (float) and bias type (c10::BFloat16) should be the same
```
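This mismatch is expected once the autocast wrapper is gone: with `"bf16": {"enabled": true}` DeepSpeed casts the module weights to bfloat16, while the features are still fed in as fp32. One hedged workaround is to cast floating-point inputs to the model's parameter dtype before the forward pass; `cast_to_model_dtype` below is a hypothetical helper, not WeNet API:

```python
import torch

def cast_to_model_dtype(model: torch.nn.Module, feats: torch.Tensor) -> torch.Tensor:
    """Cast floating-point inputs to the model's parameter dtype so ops like
    F.conv1d see matching input/bias dtypes (bfloat16 under DeepSpeed bf16)."""
    param_dtype = next(model.parameters()).dtype
    return feats.to(param_dtype) if feats.is_floating_point() else feats
```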
Manually casting the inputs to bf16 then moves the failure into the loss computation:

```
RuntimeError: "ctc_loss_cuda" not implemented for 'BFloat16'
```
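ctc_loss has no BFloat16 CUDA kernel, so a common workaround is to upcast just that op to fp32. A minimal sketch, assuming the blank id and reduction below rather than taking them from the WeNet code:

```python
import torch
import torch.nn.functional as F

def ctc_loss_fp32(log_probs: torch.Tensor, targets: torch.Tensor,
                  input_lengths: torch.Tensor,
                  target_lengths: torch.Tensor) -> torch.Tensor:
    """Compute CTC loss in float32 even when log_probs arrive as bfloat16.

    log_probs: (T, N, C) log-softmax outputs.
    """
    return F.ctc_loss(
        log_probs.float(),   # upcast only for the loss kernel
        targets,
        input_lengths,
        target_lengths,
        blank=0,             # assumption: blank id 0
        reduction="sum",     # assumption: sum reduction
        zero_infinity=True,
    )
```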
torch: 2.6.0, deepspeed: 0.17.5
What is your deepspeed config?
The conf/ds_stage1.json that ships with the code:

```json
{
  "train_micro_batch_size_per_gpu": 1,
  "gradient_accumulation_steps": 1,
  "steps_per_print": 100,
  "gradient_clipping": 5,
  "fp16": {
    "enabled": false,
    "auto_cast": false,
    "loss_scale": 0,
    "initial_scale_power": 16,
    "loss_scale_window": 1000,
    "hysteresis": 2,
    "consecutive_hysteresis": false,
    "min_loss_scale": 1
  },
  "bf16": {
    "enabled": true
  },
  "zero_force_ds_cpu_optimizer": false,
  "zero_optimization": {
    "stage": 1,
    "offload_optimizer": {
      "device": "none",
      "pin_memory": true
    },
    "allgather_partitions": true,
    "allgather_bucket_size": 5e8,
    "overlap_comm": true,
    "reduce_scatter": true,
    "reduce_bucket_size": 5e8,
    "contiguous_gradients": true
  }
}
```
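Note that this config enables bf16 but declares no torch_autocast section, which is exactly the combination the first assertion complains about once batch_forward wraps the forward in torch.autocast; the config sketch earlier in the thread shows the section the error message asks for.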
Did you ever solve this? I'm running into the same problem.