ColossalAI
[BUG]: Some unsupported torch function is operated upon this parameter
🐛 Describe the bug
When I train Stable Diffusion with cond_stage_trainable set to True, I get this error: Some unsupported torch function is operated upon this parameter. Is updating only the UNet weights supported?
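For context, in LDM-style training code, setting cond_stage_trainable to True typically adds the text encoder's parameters to the optimizer alongside the UNet's, so they also come under Gemini's parameter management. Below is a minimal sketch of that pattern; the function name build_optimizer and its arguments are placeholders, not taken from the actual training script.

```python
import torch
import torch.nn as nn

def build_optimizer(unet: nn.Module, cond_stage_model: nn.Module,
                    cond_stage_trainable: bool, lr: float) -> torch.optim.Optimizer:
    # Collect the UNet weights; with cond_stage_trainable=True the text encoder's
    # weights are optimized as well, so the ZeRO/Gemini wrapper must manage them too.
    params = list(unet.parameters())
    if cond_stage_trainable:
        params += list(cond_stage_model.parameters())
    return torch.optim.AdamW(params, lr=lr)
```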
Environment
CUDA 11.2, Python 3.7.10, PyTorch 1.13.1
Please update Lightning and ColossalAI to the latest versions.
That did not work.
The same error occurs when I change the code to forward the UNet model twice. Have you figured out how to fix it?
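For reference, here is a minimal sketch of the "two forward passes before one backward" pattern under the Gemini plugin, using a toy model in place of the UNet. It assumes launching with torchrun, and the Booster/GeminiPlugin calls follow the 0.3.x API, which may differ in other versions.

```python
import torch
import torch.nn as nn
import colossalai
from colossalai.booster import Booster
from colossalai.booster.plugin import GeminiPlugin

colossalai.launch_from_torch(config={})

# Toy stand-in for the UNet.
model = nn.Sequential(nn.Linear(16, 16), nn.ReLU(), nn.Linear(16, 1)).cuda()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
booster = Booster(plugin=GeminiPlugin())
model, optimizer, *_ = booster.boost(model, optimizer)

x1, x2 = torch.randn(4, 16).cuda(), torch.randn(4, 16).cuda()
target = torch.zeros(4, 1).cuda()

# Two forward passes through the same Gemini-wrapped module before a single backward,
# mirroring the "forward the UNet twice" situation described above.
loss = nn.functional.mse_loss(model(x1), target) + nn.functional.mse_loss(model(x2), target)
booster.backward(loss, optimizer)  # Gemini requires booster.backward rather than loss.backward
optimizer.step()
optimizer.zero_grad()
```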
We have made a lot of updates since then; please check the latest code.
This issue was closed due to inactivity. Thanks.
I also encountered the same error. Did you manage to resolve it later on?
I'm using ColossalAI version 0.3.0 and ran into the same error: RuntimeError: Parameter "tor_bond_conv.batch_norm.bias" failed at the gradient reduction. Some unsupported torch function is operated upon this parameter. In gemini_plugin.py I found a comment mentioning that the support for ZeRO in ColossalAI is currently not optimal, along with the commented-out line model = nn.SyncBatchNorm.convert_sync_batchnorm(model, None). I suspected the issue was caused by the Batch Normalization layers in my model, but even after uncommenting that line, the error persists unchanged.
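For reference, the same conversion can also be attempted from user code before boosting, which is what that commented-out line appears to intend. This is only a sketch under that assumption, not a confirmed fix, and the toy model here merely stands in for the module containing tor_bond_conv.batch_norm.

```python
import torch
import torch.nn as nn
import colossalai
from colossalai.booster import Booster
from colossalai.booster.plugin import GeminiPlugin

colossalai.launch_from_torch(config={})

# Toy model with a BatchNorm layer standing in for the failing batch_norm module.
model = nn.Sequential(nn.Conv1d(8, 8, 3), nn.BatchNorm1d(8), nn.ReLU()).cuda()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

# Convert BatchNorm layers to SyncBatchNorm before handing the model to the booster;
# process_group=None uses the default process group.
model = nn.SyncBatchNorm.convert_sync_batchnorm(model, None)

booster = Booster(plugin=GeminiPlugin())
model, optimizer, *_ = booster.boost(model, optimizer)
```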