
RuntimeError: Sizes of tensors must match except in dimension 1

Open MathiasHolmstrom opened this issue 1 year ago • 16 comments

  • PyTorch-Forecasting version: 1.0.0
  • PyTorch version: 2.0.1+cpu
  • Python version: 3.9
  • Operating System: Windows11

Expected behavior

I ran Baseline().predict(val_dataloader, return_y=True) and did not expect any errors.

Actual behavior

Received the following error

    return torch.cat(sequences, dim=1)
RuntimeError: Sizes of tensors must match except in dimension 1. Expected size 1280 but got size 42 for tensor number 14 in the list.


Code to reproduce the problem

I am running the following code on an internal dataset

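from pytorch_forecasting import Baseline, TimeSeriesDataSet

# `data` is the internal pandas DataFrame (not shown here)
max_prediction_length = 6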
max_encoder_length = 24
training_cutoff = data["time_idx"].max() - max_prediction_length

training = TimeSeriesDataSet(
    data[data['time_idx'] <= training_cutoff],
    group_ids=["product_number", "sku_size", "retail_sales_channel"],
    time_idx="time_idx",
    target="quantity_sold",
    min_prediction_length=1,
    time_varying_known_reals=["time_idx", "discount_rate"],
    time_varying_unknown_categoricals=[],
    time_varying_unknown_reals=[
        "quantity_physical_closing",
    ], 
    add_relative_time_idx=True,
    add_target_scales=True,
    add_encoder_length=True,
)
validation = TimeSeriesDataSet.from_dataset(training, data, predict=True, stop_randomization=True)
batch_size = 128  # set this between 32 to 128
train_dataloader = training.to_dataloader(train=True, batch_size=batch_size, num_workers=0)
val_dataloader = validation.to_dataloader(train=False, batch_size=batch_size * 10, num_workers=0)
baseline_predictions = Baseline().predict(val_dataloader, return_y=True)

MathiasHolmstrom avatar Jun 02 '23 14:06 MathiasHolmstrom

I didn't run the code, but I know that len(training) % 128 == 42 or len(training) % 1280 == 42.
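
A minimal sketch of the arithmetic behind that remark, in plain Python. The helper `batch_sizes` and the dataset length are hypothetical, chosen only so that the remainder matches the 42 in the error message; the real length of the dataset is not given in this thread.

```python
def batch_sizes(n_samples: int, batch_size: int) -> list:
    """Return the size of every batch when the incomplete last batch is kept."""
    full, remainder = divmod(n_samples, batch_size)
    return [batch_size] * full + ([remainder] if remainder else [])


# Hypothetical dataset length: 14 full batches of 1280 plus 42 leftover samples.
n_samples = 14 * 1280 + 42
sizes = batch_sizes(n_samples, 1280)
print(len(sizes), sizes[0], sizes[-1])  # 15 1280 42
```

Every batch except the last holds 1280 samples, so any step that requires all batches to share the same first dimension fails on the final one.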

Their code is funny.

ntlm1686 avatar Jun 03 '23 06:06 ntlm1686

So do you know what I can change to make it work?

MathiasHolmstrom avatar Jun 06 '23 07:06 MathiasHolmstrom

Just make the length of the training set an integer multiple of the batch size.

For example, your batch size is 64. Training length is 6420. Then drop the last 20 samples.
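
The trimming described here can be sketched in plain Python as follows. The helper `trim_to_full_batches` is hypothetical and only illustrates the arithmetic; it is not part of pytorch-forecasting.

```python
def trim_to_full_batches(samples: list, batch_size: int) -> list:
    """Drop trailing samples so that the length is a multiple of batch_size."""
    usable = len(samples) - len(samples) % batch_size
    return samples[:usable]


samples = list(range(6420))
trimmed = trim_to_full_batches(samples, 64)
print(len(samples) - len(trimmed), len(trimmed))  # 20 6400
```

A plain torch.utils.data.DataLoader offers drop_last=True to discard an incomplete final batch in the same spirit. Either way, the dropped samples receive no prediction, so this may not be acceptable for evaluation.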

ntlm1686 avatar Jun 06 '23 15:06 ntlm1686

It's the validation data that fails so I assume I should drop it based on validation set? Although I tried both and neither works.

MathiasHolmstrom avatar Jun 07 '23 14:06 MathiasHolmstrom

I am currently facing a similar issue when I try to evaluate the performance of the TFT model.

predictions = best_tft.predict(val_dataloader, return_y=True, trainer_kwargs=dict(accelerator="cpu"))
MAE()(predictions.output, predictions.y)

Please, if you find a way around yours, let me know how

adejumobioluwafemi avatar Jun 21 '23 10:06 adejumobioluwafemi

I'm having the same issue with pretty much the same code :/

hippotilt avatar Jul 05 '23 12:07 hippotilt

Yes, the code in question (which produces this error) is in the TFT demand example in the documentation.

neverfox avatar Jul 05 '23 14:07 neverfox

I've found a fix: modifying the concat_sequences() function in utils.py so that it pads the last sequence tensor with NaNs until its size matches that of the others. I'm not sure how reliable this is, but with this change my code runs.

# Imports needed by this snippet
from typing import List, Union

import torch
import torch.nn.functional as F
from torch.nn.utils import rnn


def concat_sequences(
    sequences: Union[List[torch.Tensor], List[rnn.PackedSequence]]
) -> Union[torch.Tensor, rnn.PackedSequence]:
    """
    Concatenate RNN sequences.

    Args:
        sequences (Union[List[torch.Tensor], List[rnn.PackedSequence]): list of RNN packed sequences or tensors of which
            first index are samples and second are timesteps

    Returns:
        Union[torch.Tensor, rnn.PackedSequence]: concatenated sequence
    """
    if isinstance(sequences[0], rnn.PackedSequence):
        return rnn.pack_sequence(sequences, enforce_sorted=False)
    elif isinstance(sequences[0], torch.Tensor):
        # BEGINNING OF MODIFIED CODE
        # If the last batch has fewer samples than the first one, pad it with NaNs.
        if sequences[0].size(0) > sequences[-1].size(0):
            delta = sequences[0].size(0) - sequences[-1].size(0)
            # pad=(0, 0, 0, delta) appends `delta` entries along the second-to-last
            # dimension, which is the sample axis for 2-D tensors.
            sequences[-1] = F.pad(sequences[-1], pad=(0, 0, 0, delta), mode="constant", value=torch.nan)
        # END OF MODIFIED CODE
        return torch.cat(sequences, dim=1)
    elif isinstance(sequences[0], (tuple, list)):
        return tuple(
            concat_sequences([sequences[ii][i] for ii in range(len(sequences))]) for i in range(len(sequences[0]))
        )
    else:
        raise ValueError("Unsupported sequence type")
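
To make the effect of the padding concrete, here is a small standalone sketch on toy tensors. The shapes (4 and 2 samples, 3 timesteps) are assumptions for illustration only and stand in for the 1280 and 42 from the error message.

```python
import torch
import torch.nn.functional as F

full_batch = torch.ones(4, 3)  # 4 samples, 3 timesteps
last_batch = torch.ones(2, 3)  # incomplete final batch with only 2 samples

# Append NaN rows until the sample dimension matches the full batch.
delta = full_batch.size(0) - last_batch.size(0)
padded = F.pad(last_batch, pad=(0, 0, 0, delta), mode="constant", value=float("nan"))
print(padded.shape)  # torch.Size([4, 3])

# With equal sample counts, concatenation along dim=1 no longer raises.
combined = torch.cat([full_batch, padded], dim=1)
print(combined.shape)  # torch.Size([4, 6])
```

The padded rows contain only NaNs, so anything computed from the result (for example a metric) would presumably need to mask them out.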

hippotilt avatar Jul 06 '23 14:07 hippotilt

I've been struggling with a similar problem for a long time now. What worked for me (I don't know if it makes mathematical sense) was to lower the batch size to the size reported in the error message, in your case 42.

Hope this helps

DaniloMendezR avatar Jul 14 '23 06:07 DaniloMendezR

Please see my comment here - https://github.com/jdb78/pytorch-forecasting/issues/449#issuecomment-1649288069.

If you don't need the ys (it's easy to format them yourself), then setting return_y = False fixes the issue.

@hippotilt thanks! I tracked down the problem to this function. It would be nice if something similar was merged upstream so that we don't need to hack it in our own code.

abudis avatar Jul 25 '23 07:07 abudis

I encountered the same error and narrowed down the issue, as mentioned by many above, to the concat_sequences function in utils.py. The following fix worked for me:

def concat_sequences(
    sequences: Union[List[torch.Tensor], List[rnn.PackedSequence]]
) -> Union[torch.Tensor, rnn.PackedSequence]:
    """
    Concatenate RNN sequences.

    Args:
        sequences (Union[List[torch.Tensor], List[rnn.PackedSequence]): list of RNN packed sequences or tensors of which
            first index are samples and second are timesteps

    Returns:
        Union[torch.Tensor, rnn.PackedSequence]: concatenated sequence
    """
    if isinstance(sequences[0], rnn.PackedSequence):
        return rnn.pack_sequence(sequences, enforce_sorted=False)
    elif isinstance(sequences[0], torch.Tensor):
        return torch.cat(sequences, dim=0)  # changed from dim=1 to dim=0
    elif isinstance(sequences[0], (tuple, list)):
        return tuple(
            concat_sequences([sequences[ii][i] for ii in range(len(sequences))]) for i in range(len(sequences[0]))
        )
    else:
        raise ValueError("Unsupported sequence type")

Just changing the concat dimension to 0 (the axis containing the batches) fixes the error. I am not sure how this function is used elsewhere in the package and hope it does not break things in those places.
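
A standalone sketch of why the dimension matters, using toy tensors whose sample counts mirror the error message. The 6 timesteps are an assumption taken from max_prediction_length in the original report; the tensors themselves are placeholders, not real model output.

```python
import torch

full_batch = torch.zeros(1280, 6)  # a full batch: 1280 samples, 6 timesteps
last_batch = torch.zeros(42, 6)    # the incomplete final batch: 42 samples

# Concatenating along dim=1 requires every other dimension to match, so it fails.
try:
    torch.cat([full_batch, last_batch], dim=1)
    failed = False
except RuntimeError as err:
    failed = True
    print(err)

# Concatenating along dim=0 stacks the samples and only requires equal timesteps.
stacked = torch.cat([full_batch, last_batch], dim=0)
print(stacked.shape)  # torch.Size([1322, 6])
```

With dim=0 the result has one row per sample, which matches how predictions are normally laid out. Whether other callers of concat_sequences rely on dim=1 is not established in this thread.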

Meet1995 avatar Nov 12 '23 01:11 Meet1995

I am currently facing a similar issue when I try to evaluate the performance of the TFT model.

predictions = best_tft.predict(val_dataloader, return_y=True, trainer_kwargs=dict(accelerator="cpu"))
MAE()(predictions.output, predictions.y)

Please, if you find a way around yours, let me know how

Same issue here: I can't predict all my examples because their count isn't a multiple of batch_size. It would be great to have a fix for this one.

deltawi avatar Jan 15 '24 05:01 deltawi