Bug of `setup` for `SingleDeviceStrategy` with `LightningLite`
First check
- [X] I'm sure this is a bug.
- [X] I've added a descriptive title to this bug.
- [X] I've provided clear instructions on how to reproduce the bug.
- [X] I've added a code sample.
- [X] I've provided any other important info that is required.
Bug description
Hi there! I found a bug in `SingleDeviceStrategy` with `LightningLite`: when I use `setup` to set up the model, the device of the model is expected to match the device of the strategy, but it does not. The code below reproduces the bug.
How to reproduce the bug

```python
lite = EmptyLite(accelerator="auto", strategy=None, devices="0,")
model = nn.Linear(1, 2)
lite_model = lite.setup(model)
print(lite._strategy.__class__.__name__)  # SingleDeviceStrategy
print(lite.device, lite_model.device)  # cuda:0 cpu (!!! unexpected)
```

Error messages and logs

No response
Important info

- Lightning Component (e.g. Trainer, LightningModule, LightningApp, LightningWork, LightningFlow):
- PyTorch Lightning Version (e.g., 1.5.0): 1.7.6
- Lightning App Version (e.g., 0.5.2): NA
- PyTorch Version (e.g., 1.10): 1.12.1+cu113
- Python version (e.g., 3.9): 3.7
- OS (e.g., Linux): Linux
- CUDA/cuDNN version: 11.3
- GPU models and configuration: NA
- How you installed Lightning (`conda`, `pip`, source): pip
- Running environment of LightningApp (e.g. local, cloud): local
More info
No response
@JinchaoLove Thanks for trying Lite and reporting this issue! I found the problem already.
Don't worry, the model parameters are all on the correct device, so you should be able to train your model on the GPU without problems. It is just that the wrapper's `.device` property has not correctly updated its value. I'm preparing a fix for this.
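As a quick sanity check (a sketch, not part of the fix): a plain `nn.Module` has no `.device` attribute of its own, so the ground truth is the device of the module's parameters, which you can inspect directly regardless of what the wrapper's `.device` property reports:

```python
import torch
from torch import nn

model = nn.Linear(1, 2)
# Move to GPU if available; otherwise this check still runs on CPU.
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
model.to(device)

# The actual device of the model is the device its parameters live on:
param_device = next(model.parameters()).device
print(param_device)
```

If `param_device` prints `cuda:0` while the wrapper reports `cpu`, only the reported property is stale, not the model itself.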
Probably a duplicate of https://github.com/Lightning-AI/lightning/issues/13108 but for Lite
@carmocca It is not a duplicate of #13108. What is observed here is a limitation of `DeviceDtypeModuleMixin`, which cannot know the initial device of a module and assumes it to be on the CPU.
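To illustrate the limitation (a minimal sketch, not the real `DeviceDtypeModuleMixin`): a mixin that only learns about device placement through explicit `.to()` calls has to assume *some* initial device, and defaulting to CPU is exactly what produces the stale `cpu` reading when the strategy moved the parameters before the wrapper was attached:

```python
class DeviceTrackingMixin:
    """Hypothetical sketch: tracks the device only through explicit .to()
    calls, so it must guess the initial device and defaults to CPU."""

    def __init__(self):
        self._device = "cpu"  # assumption baked in: module starts on CPU

    @property
    def device(self):
        return self._device

    def to(self, device):
        # Only an explicit .to() call updates the tracked device.
        self._device = device
        return self


m = DeviceTrackingMixin()
print(m.device)  # "cpu" -- even if the wrapped parameters already live on cuda:0
m.to("cuda:0")
print(m.device)  # "cuda:0" -- correct once .to() is called explicitly
```

This is why the fix has to initialize the tracked device from the strategy's root device rather than relying on the CPU default.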
Dear all, thanks for the efficient ⚡️ reply. Exactly, this issue does not affect training. Thanks!
Thanks @JinchaoLove! Glad to hear that, and happy to help.