
Can't execute sample code of N-HiTS tutorial


  • PyTorch-Forecasting version: 0.10.1
  • PyTorch version: 1.11.0+cu113
  • Python version: 3.7.13
  • Operating System:

Expected behavior

I tried to run the sample code from the N-HiTS tutorial on Colab. Because importing `MQF2DistributionLoss` fails (it has not been added to pytorch-forecasting v0.10.1 yet), I copied the needed code snippet from GitHub into my Colab notebook. After that, everything works fine until the "find optimal learning rate" stage, where I always get the error below. Any idea? Thanks!
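For reference, the import workaround looks roughly like the sketch below; the fallback module name is hypothetical, as in my notebook the class definition is pasted in directly.

```python
# Sketch of the import workaround (illustrative only).
try:
    # available only in pytorch-forecasting releases newer than 0.10.1
    from pytorch_forecasting.metrics import MQF2DistributionLoss
except ImportError:
    # fall back to the class definition copied from the GitHub main branch
    # into the notebook (shown here as a hypothetical local module)
    from mqf2_loss_copy import MQF2DistributionLoss
```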

Actual behavior

IndexError                                Traceback (most recent call last)
[<ipython-input-243-9471f426b350>](https://localhost:8080/#) in <module>()
      1 # find optimal learning rate
----> 2 res = trainer.tuner.lr_find(net, train_dataloaders=train_dataloader, val_dataloaders=val_dataloader, min_lr=1e-5)
      3 print(f"suggested learning rate: {res.suggestion()}")
      4 fig = res.plot(show=True, suggest=True)
      5 fig.show()

23 frames
[/usr/local/lib/python3.7/dist-packages/pytorch_lightning/tuner/tuning.py](https://localhost:8080/#) in lr_find(self, model, train_dataloaders, val_dataloaders, datamodule, min_lr, max_lr, num_training, mode, early_stop_threshold, update_attr)
    201                 "mode": mode,
    202                 "early_stop_threshold": early_stop_threshold,
--> 203                 "update_attr": update_attr,
    204             },
    205         )

[/usr/local/lib/python3.7/dist-packages/pytorch_lightning/trainer/trainer.py](https://localhost:8080/#) in tune(self, model, train_dataloaders, val_dataloaders, datamodule, scale_batch_size_kwargs, lr_find_kwargs)
   1125         with isolate_rng():
   1126             result = self.tuner._tune(
-> 1127                 model, scale_batch_size_kwargs=scale_batch_size_kwargs, lr_find_kwargs=lr_find_kwargs
   1128             )
   1129 

[/usr/local/lib/python3.7/dist-packages/pytorch_lightning/tuner/tuning.py](https://localhost:8080/#) in _tune(self, model, scale_batch_size_kwargs, lr_find_kwargs)
     61         if self.trainer.auto_lr_find:
     62             lr_find_kwargs.setdefault("update_attr", True)
---> 63             result["lr_find"] = lr_find(self.trainer, model, **lr_find_kwargs)
     64 
     65         self.trainer.state.status = TrainerStatus.FINISHED

[/usr/local/lib/python3.7/dist-packages/pytorch_lightning/tuner/lr_finder.py](https://localhost:8080/#) in lr_find(trainer, model, min_lr, max_lr, num_training, mode, early_stop_threshold, update_attr)
    222 
    223     # Fit, lr & loss logged in callback
--> 224     trainer.tuner._run(model)
    225 
    226     # Prompt if we stopped early

[/usr/local/lib/python3.7/dist-packages/pytorch_lightning/tuner/tuning.py](https://localhost:8080/#) in _run(self, *args, **kwargs)
     71         self.trainer.state.status = TrainerStatus.RUNNING  # last `_run` call might have set it to `FINISHED`
     72         self.trainer.training = True
---> 73         self.trainer._run(*args, **kwargs)
     74         self.trainer.tuning = True
     75 

[/usr/local/lib/python3.7/dist-packages/pytorch_lightning/trainer/trainer.py](https://localhost:8080/#) in _run(self, model, ckpt_path)
   1232         self._checkpoint_connector.resume_end()
   1233 
-> 1234         results = self._run_stage()
   1235 
   1236         log.detail(f"{self.__class__.__name__}: trainer tearing down")

[/usr/local/lib/python3.7/dist-packages/pytorch_lightning/trainer/trainer.py](https://localhost:8080/#) in _run_stage(self)
   1319         if self.predicting:
   1320             return self._run_predict()
-> 1321         return self._run_train()
   1322 
   1323     def _pre_training_routine(self):

[/usr/local/lib/python3.7/dist-packages/pytorch_lightning/trainer/trainer.py](https://localhost:8080/#) in _run_train(self)
   1341 
   1342         with isolate_rng():
-> 1343             self._run_sanity_check()
   1344 
   1345         # enable train mode

[/usr/local/lib/python3.7/dist-packages/pytorch_lightning/trainer/trainer.py](https://localhost:8080/#) in _run_sanity_check(self)
   1409             # run eval step
   1410             with torch.no_grad():
-> 1411                 val_loop.run()
   1412 
   1413             self._call_callback_hooks("on_sanity_check_end")

[/usr/local/lib/python3.7/dist-packages/pytorch_lightning/loops/base.py](https://localhost:8080/#) in run(self, *args, **kwargs)
    202             try:
    203                 self.on_advance_start(*args, **kwargs)
--> 204                 self.advance(*args, **kwargs)
    205                 self.on_advance_end()
    206                 self._restarting = False

[/usr/local/lib/python3.7/dist-packages/pytorch_lightning/loops/dataloader/evaluation_loop.py](https://localhost:8080/#) in advance(self, *args, **kwargs)
    152         if self.num_dataloaders > 1:
    153             kwargs["dataloader_idx"] = dataloader_idx
--> 154         dl_outputs = self.epoch_loop.run(self._data_fetcher, dl_max_batches, kwargs)
    155 
    156         # store batch level output per dataloader

[/usr/local/lib/python3.7/dist-packages/pytorch_lightning/loops/base.py](https://localhost:8080/#) in run(self, *args, **kwargs)
    202             try:
    203                 self.on_advance_start(*args, **kwargs)
--> 204                 self.advance(*args, **kwargs)
    205                 self.on_advance_end()
    206                 self._restarting = False

[/usr/local/lib/python3.7/dist-packages/pytorch_lightning/loops/epoch/evaluation_epoch_loop.py](https://localhost:8080/#) in advance(self, data_fetcher, dl_max_batches, kwargs)
    125 
    126         # lightning module methods
--> 127         output = self._evaluation_step(**kwargs)
    128         output = self._evaluation_step_end(output)
    129 

[/usr/local/lib/python3.7/dist-packages/pytorch_lightning/loops/epoch/evaluation_epoch_loop.py](https://localhost:8080/#) in _evaluation_step(self, **kwargs)
    220             output = self.trainer._call_strategy_hook("test_step", *kwargs.values())
    221         else:
--> 222             output = self.trainer._call_strategy_hook("validation_step", *kwargs.values())
    223 
    224         return output

[/usr/local/lib/python3.7/dist-packages/pytorch_lightning/trainer/trainer.py](https://localhost:8080/#) in _call_strategy_hook(self, hook_name, *args, **kwargs)
   1761 
   1762         with self.profiler.profile(f"[Strategy]{self.strategy.__class__.__name__}.{hook_name}"):
-> 1763             output = fn(*args, **kwargs)
   1764 
   1765         # restore current_fx when nested context

[/usr/local/lib/python3.7/dist-packages/pytorch_lightning/strategies/strategy.py](https://localhost:8080/#) in validation_step(self, *args, **kwargs)
    342         """
    343         with self.precision_plugin.val_step_context():
--> 344             return self.model.validation_step(*args, **kwargs)
    345 
    346     def test_step(self, *args, **kwargs) -> Optional[STEP_OUTPUT]:

[/usr/local/lib/python3.7/dist-packages/pytorch_forecasting/models/base_model.py](https://localhost:8080/#) in validation_step(self, batch, batch_idx)
    411     def validation_step(self, batch, batch_idx):
    412         x, y = batch
--> 413         log, out = self.step(x, y, batch_idx)
    414         log.update(self.create_log(x, y, out, batch_idx))
    415         return log

[/usr/local/lib/python3.7/dist-packages/pytorch_forecasting/models/nhits/__init__.py](https://localhost:8080/#) in step(self, x, y, batch_idx)
    338         Take training / validation step.
    339         """
--> 340         log, out = super().step(x, y, batch_idx=batch_idx)
    341 
    342         if self.hparams.backcast_loss_ratio > 0:  # add loss from backcast

[/usr/local/lib/python3.7/dist-packages/pytorch_forecasting/models/base_model.py](https://localhost:8080/#) in step(self, x, y, batch_idx, **kwargs)
    553                 loss = self.loss(prediction, y, **mase_kwargs)
    554             else:
--> 555                 loss = self.loss(prediction, y)
    556 
    557         self.log(

[/usr/local/lib/python3.7/dist-packages/torch/nn/modules/module.py](https://localhost:8080/#) in _call_impl(self, *input, **kwargs)
   1108         if not (self._backward_hooks or self._forward_hooks or self._forward_pre_hooks or _global_backward_hooks
   1109                 or _global_forward_hooks or _global_forward_pre_hooks):
-> 1110             return forward_call(*input, **kwargs)
   1111         # Do not call functions when jit is used
   1112         full_backward_hooks, non_full_backward_hooks = [], []

[/usr/local/lib/python3.7/dist-packages/pytorch_forecasting/metrics.py](https://localhost:8080/#) in forward(self, y_pred, target, **kwargs)
    196         # need this explicitly to avoid backpropagation errors because of sketchy caching
    197         y_pred_flattened, target_flattened = self._convert(y_pred, target)
--> 198         return self.torchmetric.forward(y_pred_flattened, target_flattened, **kwargs)
    199 
    200     def compute(self):

[/usr/local/lib/python3.7/dist-packages/torchmetrics/metric.py](https://localhost:8080/#) in forward(self, *args, **kwargs)
    246 
    247         # global accumulation
--> 248         self.update(*args, **kwargs)
    249 
    250         self._to_sync = self.dist_sync_on_step  # type: ignore

[/usr/local/lib/python3.7/dist-packages/torchmetrics/metric.py](https://localhost:8080/#) in wrapped_func(*args, **kwargs)
    310             self._update_called = True
    311             with torch.set_grad_enabled(self._enable_grad):
--> 312                 update(*args, **kwargs)
    313             if self.compute_on_cpu:
    314                 self._move_list_states_to_cpu()

[<ipython-input-234-c4caa05ac8fd>](https://localhost:8080/#) in update(self, y_pred, target)
     42             target, lengths = unpack_sequence(target)
     43         else:
---> 44             lengths = torch.full((target.size(0),), fill_value=target.size(1), dtype=torch.long, device=target.device)
     45 
     46         losses = self.loss(y_pred, target)

IndexError: Dimension out of range (expected to be in range of [-1, 0], but got 1)
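The final IndexError comes from calling `target.size(1)` on a tensor that apparently only has one dimension at that point. A minimal sketch of that failure mode:

```python
import torch

# .size(1) on a 1-D tensor raises the same error as in the traceback above
target = torch.zeros(4)   # shape (4,): only dimension 0 exists
print(target.size(0))     # 4
print(target.size(1))     # IndexError: Dimension out of range (expected to be in range of [-1, 0], but got 1)
```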

### Code to reproduce the problem

    # find optimal learning rate
    res = trainer.tuner.lr_find(
        net,
        train_dataloaders=train_dataloader,
        val_dataloaders=val_dataloader,
        min_lr=1e-5,
    )
    print(f"suggested learning rate: {res.suggestion()}")
    fig = res.plot(show=True, suggest=True)
    fig.show()
    net.hparams.learning_rate = res.suggestion()
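The model and dataloaders are set up following the tutorial. The sketch below approximates that setup rather than reproducing my exact notebook; the DataFrame `data`, `training_cutoff`, and the column names are placeholders.

```python
# Approximate setup following the N-HiTS tutorial (not the exact notebook code).
# `data`, `training_cutoff` and the column names are placeholders.
import pytorch_lightning as pl
from pytorch_forecasting import NHiTS, TimeSeriesDataSet

max_encoder_length = 60
max_prediction_length = 20

training = TimeSeriesDataSet(
    data[lambda x: x.time_idx <= training_cutoff],
    time_idx="time_idx",
    target="value",
    group_ids=["series"],
    max_encoder_length=max_encoder_length,
    max_prediction_length=max_prediction_length,
    time_varying_unknown_reals=["value"],
)
validation = TimeSeriesDataSet.from_dataset(training, data, min_prediction_idx=training_cutoff + 1)

train_dataloader = training.to_dataloader(train=True, batch_size=128, num_workers=0)
val_dataloader = validation.to_dataloader(train=False, batch_size=128, num_workers=0)

trainer = pl.Trainer(gpus=1, gradient_clip_val=0.1)
net = NHiTS.from_dataset(
    training,
    learning_rate=3e-2,
    weight_decay=1e-2,
    backcast_loss_ratio=0.0,
    # loss=MQF2DistributionLoss(prediction_length=max_prediction_length),  # class copied from GitHub
)
```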


chrischang80 · May 25 '22 04:05