darts
[BUG] Issue with historical_forecasts method
I ran into this problem with the historical_forecasts method of both RNNModel and NBEATSModel. It raises "ValueError: conflicting sizes for dimension 'time': length 47 on the data but length 461 on coordinate 'time'". I'm not sure, but my guess is that the error happens at the last stride of the method. Changing the forecast_horizon and stride values changes the "47" and "461" values in the error. Here is the traceback:
ValueError Traceback (most recent call last)
Input In [29], in <cell line: 1>()
----> 1 hist_forcast = model.historical_forecasts(series=data_trans,
2 num_samples=20,
3 start=0.85,
4 forecast_horizon=5,
5 stride=10,
6 )
File ~\anaconda3\envs\torch39\lib\site-packages\darts\utils\utils.py:172, in _with_sanity_checks.<locals>.decorator.<locals>.sanitized_method(self, *args, **kwargs)
169 only_args.pop("self")
171 getattr(self, sanity_check_method)(*only_args.values(), **only_kwargs)
--> 172 return method_to_sanitize(self, *only_args.values(), **only_kwargs)
File ~\anaconda3\envs\torch39\lib\site-packages\darts\models\forecasting\forecasting_model.py:469, in ForecastingModel.historical_forecasts(self, series, past_covariates, future_covariates, num_samples, train_length, start, forecast_horizon, stride, retrain, overlap_end, last_points_only, verbose)
461 return TimeSeries.from_times_and_values(
462 pd.DatetimeIndex(last_points_times, freq=series.freq * stride),
463 np.array(last_points_values),
(...)
466 hierarchy=series.hierarchy,
467 )
468 else:
--> 469 return TimeSeries.from_times_and_values(
470 pd.RangeIndex(
471 start=last_points_times[0],
472 stop=last_points_times[-1] + 1,
473 step=1,
474 ),
475 np.array(last_points_values),
476 columns=series.columns,
477 static_covariates=series.static_covariates,
478 hierarchy=series.hierarchy,
479 )
481 return forecasts
File ~\anaconda3\envs\torch39\lib\site-packages\darts\timeseries.py:934, in TimeSeries.from_times_and_values(cls, times, values, fill_missing_dates, freq, columns, fillna_value, static_covariates, hierarchy)
931 if columns is not None:
932 coords[DIMS[1]] = columns
--> 934 xa = xr.DataArray(
935 values,
936 dims=(times_name,) + DIMS[-2:],
937 coords=coords,
938 attrs={STATIC_COV_TAG: static_covariates, HIERARCHY_TAG: hierarchy},
939 )
941 return cls.from_xarray(
942 xa=xa,
943 fill_missing_dates=fill_missing_dates,
944 freq=freq,
945 fillna_value=fillna_value,
946 )
File ~\anaconda3\envs\torch39\lib\site-packages\xarray\core\dataarray.py:412, in DataArray.__init__(self, data, coords, dims, name, attrs, indexes, fastpath)
410 data = _check_data_shape(data, coords, dims)
411 data = as_compatible_data(data)
--> 412 coords, dims = _infer_coords_and_dims(data.shape, coords, dims)
413 variable = Variable(dims, data, attrs, fastpath=True)
414 indexes, coords = _create_indexes_from_coords(coords)
File ~\anaconda3\envs\torch39\lib\site-packages\xarray\core\dataarray.py:160, in _infer_coords_and_dims(shape, coords, dims)
158 for d, s in zip(v.dims, v.shape):
159 if s != sizes[d]:
--> 160 raise ValueError(
161 f"conflicting sizes for dimension {d!r}: "
162 f"length {sizes[d]} on the data but length {s} on "
163 f"coordinate {k!r}"
164 )
166 if k in sizes and v.shape != (sizes[k],):
167 raise ValueError(
168 f"coordinate {k!r} is a DataArray dimension, but "
169 f"it has shape {v.shape!r} rather than expected shape {sizes[k]!r} "
170 "matching the dimension size"
171 )
ValueError: conflicting sizes for dimension 'time': length 47 on the data but length 461 on coordinate 'time'
To Reproduce
Here is the code I wrote in Jupyter:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import yfinance as yf
from darts.timeseries import TimeSeries
from darts.models import NBEATSModel, RNNModel  # RNNModel is used below
from darts.utils.likelihood_models import GaussianLikelihood
from darts.dataprocessing.transformers import Scaler
from darts.metrics import mae, mape
from pytorch_lightning.callbacks import EarlyStopping, ModelCheckpoint
df = yf.download("GOOG", start="2010-01-01", end="2022-06-30", period="1d", progress=False)
df = df.reset_index()
data = TimeSeries.from_dataframe(df, value_cols=["Close"], fill_missing_dates=False)
data = data.astype(np.float32)
scalar_ts = Scaler()
data_trans = scalar_ts.fit_transform(data)
train, val = data_trans.split_after(0.85)
model_checkpoint_callback = ModelCheckpoint(filename="best_darts_gru_model_cb",
                                            save_top_k=1,
                                            monitor='train_loss',
                                            mode='min')
early_stopping_callback = EarlyStopping(monitor='train_loss',
                                        patience=10,
                                        mode='min',
                                        check_finite=False)
model = RNNModel(input_chunk_length=7,
                 model="GRU",
                 hidden_dim=50,
                 n_rnn_layers=2,
                 n_epochs=500,
                 training_length=7,
                 likelihood=GaussianLikelihood(),
                 model_name="GRU_Darts",
                 work_dir="optimization",
                 save_checkpoints=True,
                 pl_trainer_kwargs={"callbacks": [model_checkpoint_callback, early_stopping_callback],
                                    "accelerator": "gpu",
                                    "gpus": [0]
                                    },
                 force_reset=True)
model.fit(train, val_series=val)
hist_forcast = model.historical_forecasts(series=data_trans,
                                          num_samples=20,
                                          start=0.85,
                                          forecast_horizon=5,
                                          stride=10,
                                          )
System (please complete the following information):
- Python version: 3.9.13
- darts version: 0.20.0
Hi, I used "stride=1" this time and the method ran to completion without any error. It seems that the error only occurs for stride > 1.
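The two numbers in the error message look consistent with the pd.RangeIndex call shown in the traceback, which uses step=1 even though the forecasts are produced every `stride` steps: 47 last points spaced 10 apart span (47 - 1) * 10 + 1 = 461 index positions. The sketch below only reproduces that arithmetic with pandas and does not call darts. The starting index `first_time` is a made-up value for illustration, and the `step=stride` variant is just one way the lengths could be made to match, not necessarily how darts resolved it.

```python
import pandas as pd

stride = 10
n_forecasts = 47    # "length 47 on the data" from the error message
first_time = 2000   # hypothetical integer time index of the first forecast

# Integer time stamps of the last point of each forecast, spaced `stride` apart.
last_points_times = [first_time + i * stride for i in range(n_forecasts)]

# Index built the way the traceback shows it (step=1).
idx_step1 = pd.RangeIndex(
    start=last_points_times[0],
    stop=last_points_times[-1] + 1,
    step=1,
)

# Same bounds, but with the step matching the stride (assumption, for comparison).
idx_stride = pd.RangeIndex(
    start=last_points_times[0],
    stop=last_points_times[-1] + 1,
    step=stride,
)

print(len(last_points_times), len(idx_step1), len(idx_stride))
```

With stride=1 both indexes have the same length as the data, which would explain why the error disappears in that case.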
Hi all, I'm running into the same error :)
Thanks for reporting, we'll have a look.
Hi,
I tried to replicate your issue with the latest darts version (0.22.0) and could not reproduce it. Could you please try to update the library and see if the problem still occurs? Thank you.
Closing - feel free to re-open if the issue is spotted again.