Code stuck on "initializing ddp" when using more than one GPU on neuralforecast AutoTFT, AutoNHITS
What happened + What you expected to happen
When running this notebook with multiple GPUs:
https://colab.research.google.com/github/Nixtla/neuralforecast/blob/main/nbs/examples/IntermittentData.ipynb
the code gets stuck on "initializing ddp". Are there any parameters to control the number of GPUs utilized? For example, if I have 10 GPUs but only want to use one of them for AutoTFT, how could I do this without manually changing the low-level code? Could I define this globally in the code or notebook? (One possible approach is sketched after the links below.)
Related to this issue here: https://github.com/Lightning-AI/lightning/issues/4612
Error occurs here: https://github.com/Nixtla/neuralforecast/blob/main/neuralforecast/losses/pytorch.py
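A workaround sketch, assuming the 1.5.0 API: GPUs can be hidden globally with CUDA_VISIBLE_DEVICES before CUDA initializes, and the Auto classes seem to accept a `gpus` argument that caps the GPUs Ray Tune allocates per trial. This is hedged, not confirmed behavior; `h`, `num_samples`, and the synthetic `Y_df` below are placeholders:

```python
import os

# Global option: expose only GPU 0 to this process; must run before
# torch touches CUDA (e.g. in the first cell of the notebook).
os.environ['CUDA_VISIBLE_DEVICES'] = '0'

import pandas as pd
from neuralforecast import NeuralForecast
from neuralforecast.auto import AutoTFT

# Tiny synthetic stand-in for the notebook's Y_df (long format).
Y_df = pd.DataFrame({
    'unique_id': 'series_1',
    'ds': pd.date_range('2023-01-01', periods=120, freq='D'),
    'y': range(120),
})

# Per-model option (assumption: AutoTFT forwards `gpus` to Ray Tune as
# the per-trial GPU budget, so a single trial never spans GPUs).
model = AutoTFT(h=28, num_samples=2, gpus=1)

nf = NeuralForecast(models=[model], freq='D')
nf.fit(df=Y_df)
```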
Versions / Dependencies
neuralforecast==1.5.0
Reproduction script
nf.fit(df=Y_df)
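For context, the line above runs at the end of a setup along these lines (a hedged reconstruction of the linked notebook; the horizon and sample count are placeholders):

```python
from neuralforecast import NeuralForecast
from neuralforecast.auto import AutoNHITS, AutoTFT

# Y_df: long-format frame with ['unique_id', 'ds', 'y'] columns,
# loaded as in the IntermittentData notebook (M5 daily data).
models = [AutoNHITS(h=28, num_samples=10),
          AutoTFT(h=28, num_samples=10)]
nf = NeuralForecast(models=models, freq='D')

# Hangs at "initializing ddp" when more than one GPU is visible.
nf.fit(df=Y_df)
```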
Issue Severity
High: It blocks me from completing my task.
I have the same problem when training with neuralforecast alone, without using Ray. There are two GPUs, and I set the following in the model parameters:

strategy='ddp_notebook',  # 'dp' | 'ddp_notebook' | 'ddp_spawn'
accelerator='gpu',
devices=[0, 1],

and whichever strategy I pick, I constantly get one of these errors:

ValueError: You selected Trainer(strategy='ddp_notebook') but process forking is not supported on this platform. We recommed Trainer(strategy='ddp_spawn') instead.
ValueError: You selected Trainer(strategy='ddp_fork') but process forking is not supported on this platform. We recommed Trainer(strategy='ddp_spawn') instead.
MisconfigurationException: Trainer(strategy='ddp_spawn') is not compatible with an interactive environment. Run your code as a script, or choose one of the compatible strategies: Fabric(strategy='dp'|'ddp_notebook'). In case you are spawning processes yourself, make sure to include the Trainer creation inside the worker function.
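The last message points at a workaround: multi-GPU DDP is simply not available inside the notebook on that platform, so moving the fit into a plain script sidesteps all three errors. A minimal sketch, assuming the model forwards accelerator/devices/strategy keyword arguments to the Lightning Trainer:

```python
# train_ddp.py -- run from a terminal: python train_ddp.py
# (ddp_notebook/ddp_fork require process forking, which this platform
#  lacks, and ddp_spawn refuses interactive environments)
import pandas as pd
from neuralforecast import NeuralForecast
from neuralforecast.models import NHITS

# Synthetic placeholder data; substitute your own long-format frame.
Y_df = pd.DataFrame({
    'unique_id': 'series_1',
    'ds': pd.date_range('2023-01-01', periods=120, freq='D'),
    'y': range(120),
})

# Assumption: extra keyword args are passed to pytorch_lightning.Trainer.
model = NHITS(h=28, input_size=56, max_steps=100,
              accelerator='gpu', devices=[0, 1],
              strategy='ddp')  # plain DDP is allowed in scripts

NeuralForecast(models=[model], freq='D').fit(df=Y_df)
```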
Your error is more complicated; it may be related to your PyTorch Lightning version, and it is not directly related to my issue. Either way, it would be good to be able to change GPU options directly through the function parameters, without changing low-level code.
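For what it's worth, this already seems possible without touching library internals: the models appear to forward unknown keyword arguments to the Lightning Trainer, so GPU selection can live in the constructor call (a sketch under that assumption; the argument values are placeholders):

```python
from neuralforecast.models import NHITS

# Assumption: unrecognized keyword args become pytorch_lightning.Trainer
# arguments, so device selection is a plain model parameter.
model = NHITS(h=28, input_size=56,
              accelerator='gpu',  # train on GPU
              devices=[0])        # but pin to GPU 0 only
```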