
MultivariateNormalDistributionLoss returns Negative Value

Open nonconvexopt opened this issue 1 year ago • 0 comments

  • PyTorch-Forecasting version: 1.0.0.post123+c2320fa
  • PyTorch version: 2.0.0
  • Python version: 3.10.13
  • Operating System: Ubuntu 18.04

Expected behavior

I ran the DeepAR and DeepVAR sample code at https://pytorch-forecasting.readthedocs.io/en/stable/tutorials/deepar.html with my own asset price data. I expected the model to report a positive loss value.

Actual behavior

However, the loss stayed negative throughout training, and as a result the learning-rate finder failed.
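For context (my own note, not part of the original report): MultivariateNormalDistributionLoss is a negative log-likelihood, and for continuous targets the log-density can exceed zero whenever the predicted scale is small, so a negative loss value is numerically possible. A minimal univariate sketch of this, using only the Gaussian log-density formula:

```python
import math

def normal_log_pdf(x, mu, sigma):
    """Log-density of a univariate Normal(mu, sigma)."""
    return -math.log(sigma * math.sqrt(2 * math.pi)) - (x - mu) ** 2 / (2 * sigma ** 2)

# Daily log returns are tiny, so a fitted scale well below 1 is plausible.
log_p = normal_log_pdf(0.0, 0.0, 0.01)  # density at the mode is 1/(0.01*sqrt(2*pi)) ~ 39.9
print(log_p)    # ~ 3.69, a positive log-likelihood
print(-log_p)   # ~ -3.69, i.e. a negative NLL loss
```

So a negative loss is not by itself a sign of a broken model; the failure of the learning-rate finder is the more actionable symptom.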

Code to reproduce the problem

import datetime

import numpy as np
import pandas as pd
import lightning.pytorch as pl

from pytorch_forecasting import DeepAR, TimeSeriesDataSet
from pytorch_forecasting.data import GroupNormalizer
from pytorch_forecasting.metrics import MultivariateNormalDistributionLoss

data = pd.read_csv('data/asset_price.csv')
data = data.set_index('date_')
data.index = pd.DatetimeIndex(data.index)
data = data.resample('D').ffill().fillna(0)
data = data.loc[:, data.columns!='spgsci index.1']
original_data = data
data = data.pct_change(1)[1:]
data.replace(np.inf, 0, inplace=True)
data.fillna(0, inplace=True)
data = (data + 1).apply(np.log)
data = data.stack().reset_index()
data.columns = ['date', 'item', 'value']
data['item'] = data['item'].astype('category')
data['time_idx'] = (data['date'] - data['date'].min()).dt.days

assets = data['item'].unique().sort_values()

prediction_length = 30
encoder_length = 30

date = data["date"].max()
valid_end = date - datetime.timedelta(days=prediction_length)
train_end = valid_end - datetime.timedelta(days=prediction_length)

training = TimeSeriesDataSet(
    data[lambda x: x.date <= train_end],
    time_idx="time_idx",
    target="value",
    group_ids=["item"],
    min_encoder_length=encoder_length // 2,  # keep encoder length long (as it is in the validation set)
    max_encoder_length=encoder_length,
    min_prediction_length=30,
    max_prediction_length=prediction_length,
    static_categoricals=["item",],
    static_reals=[],
    time_varying_known_categoricals=[],
    variable_groups={},  # group of categorical variables can be treated as one variable
    time_varying_known_reals=["time_idx"],
    time_varying_unknown_categoricals=[],
    time_varying_unknown_reals=['value'],
    target_normalizer=GroupNormalizer(
        groups=["item"],  # transformation="softplus"
    ),  # normalize by group (softplus transformation commented out)
    add_relative_time_idx=True,
    add_target_scales=True,
    add_encoder_length=True,
)

# create validation set (predict=True) which means to predict the last max_prediction_length points in time
# for each series

validation = TimeSeriesDataSet.from_dataset(training, data[lambda x: x.date <= valid_end], predict=True, stop_randomization=True)
test = TimeSeriesDataSet.from_dataset(validation, data, predict=True, stop_randomization=True)

# create dataloaders for model
batch_size = 16  # typically set between 32 and 128
train_dataloader = training.to_dataloader(train=True, batch_size=batch_size, num_workers=32, batch_sampler="synchronized")
val_dataloader = validation.to_dataloader(train=False, batch_size=batch_size, num_workers=32, batch_sampler="synchronized")
test_dataloader = test.to_dataloader(train=False, batch_size=batch_size, num_workers=32, batch_sampler="synchronized")

pl.seed_everything(42)

trainer = pl.Trainer(accelerator="gpu", gradient_clip_val=1e-1)
net = DeepAR.from_dataset(
    training,
    learning_rate=1e-4,
    hidden_size=30,
    rnn_layers=2,
    loss=MultivariateNormalDistributionLoss(rank=16, sigma_init=1., sigma_minimum=0.),
    optimizer="Adam",
)

# find optimal learning rate
from lightning.pytorch.tuner import Tuner

res = Tuner(trainer).lr_find(
    net,
    train_dataloaders=train_dataloader,
    val_dataloaders=val_dataloader,
    min_lr=1e-5,
    max_lr=1e0,
    early_stop_threshold=100,
)
print(f"suggested learning rate: {res.suggestion()}")
fig = res.plot(show=True, suggest=True)
fig.show()
net.hparams.learning_rate = res.suggestion()
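A possible mitigation (an assumption on my part, not verified against the library): the repro passes sigma_minimum=0., so the predicted scale can shrink toward zero and drive the NLL arbitrarily negative. Flooring the scale at a small positive sigma_minimum bounds the per-observation NLL from below, since a univariate Gaussian's log-density peaks at the mode at -log(sigma * sqrt(2*pi)):

```python
import math

def nll_lower_bound(sigma_minimum):
    """Most negative per-observation univariate Gaussian NLL once sigma >= sigma_minimum:
    the density peaks at the mode, where log pdf = -log(sigma * sqrt(2*pi))."""
    return math.log(sigma_minimum * math.sqrt(2 * math.pi))

print(nll_lower_bound(1e-3))  # ~ -5.99: bounded, rather than diverging to -inf
# e.g. loss=MultivariateNormalDistributionLoss(rank=16, sigma_init=1.0, sigma_minimum=1e-3)
```

The exact bound differs for the low-rank multivariate case, but the qualitative point (a positive floor prevents divergence to -inf) carries over.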

nonconvexopt · Nov 29 '23