[BUG] prediction after Multi GPU training throws assertion error

Open Nishanth009 opened this issue 1 year ago • 1 comments

Hi, I was trying to train an RNN model on multiple GPUs. I followed the instructions from the website and enabled multi-GPU training as follows: pl_trainer_kwargs = {"accelerator": "gpu", "devices": -1, "auto_select_gpus": True}

Training succeeds, but prediction fails with an assertion error:

/site-packages/pytorch_lightning/overrides/distributed.py", line 213, in __init__
    assert self.num_samples >= 1 or self.total_size == 0
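For context, the condition in that assertion can be explored without any GPU. The sketch below is plain Python and rests on an assumption not confirmed in this thread: that the sampler gives each rank the indices `rank, rank + num_replicas, ...` without repeating any sample, and that `total_size` is the dataset length. The function names are hypothetical. Under that assumption, the check fails on any rank that receives no samples, for example when a single series is predicted on two GPUs.

```python
def unrepeated_num_samples(dataset_len: int, rank: int, num_replicas: int) -> int:
    """Number of samples one rank receives when indices are split without repetition.

    Assumption: rank r takes indices r, r + num_replicas, r + 2 * num_replicas, ...
    """
    return len(range(rank, dataset_len, num_replicas))


def passes_sampler_check(dataset_len: int, rank: int, num_replicas: int) -> bool:
    """Mirror of the condition in the traceback: num_samples >= 1 or total_size == 0."""
    num_samples = unrepeated_num_samples(dataset_len, rank, num_replicas)
    total_size = dataset_len  # assumption: total_size is the dataset length
    return num_samples >= 1 or total_size == 0


if __name__ == "__main__":
    # Hypothetical setup: one sample to predict, two GPUs (ranks 0 and 1).
    for rank in range(2):
        ok = passes_sampler_check(dataset_len=1, rank=rank, num_replicas=2)
        print(f"rank={rank} passes={ok}")
```

If this reading is right, the error would depend on how many prediction samples exist relative to the number of devices, which may be worth checking when building a reproduction.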

Because of this issue, we are not able to use Darts for multi-GPU training. Could someone please let me know what change is required to generate predictions when multiple GPUs are enabled? Thanks.

Nishanth009, Aug 24 '23 10:08

Hi @Nishanth009,

I think that this is a duplicate of #1945, but sadly, we don't have the hardware to reproduce it at the moment.

Could you please share a code snippet (with a dummy dataset) mimicking your approach to help us reproduce the bug?

madtoinou, Aug 25 '23 07:08

Closing this, since it seems to be a duplicate of both #1945 and #2265, where a solution has been suggested.

Please let us know if BasePredictionWriter, as suggested in one of those issues, solves the problem.

madtoinou, Mar 05 '24 13:03