etna
[BUG] Forecast with predictive intervals fails in case of data gaps
🐛 Bug Report
I ran the code from the How To Reproduce section with `PREDICTION_INTERVAL = False` and everything works. With `PREDICTION_INTERVAL = True` it fails with an unclear error: `ValueError: y_true and y_pred have different timestamps`.
The error is raised by the metrics computation in the backtest that runs as part of prediction interval generation. We don't actually need to compute metrics to build prediction intervals 🤔
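To illustrate how a gap can trigger this kind of error, here is a minimal sketch in plain pandas. It assumes the metric compares the non-missing timestamps of the actuals and the forecast; `timestamps_match` is a hypothetical stand-in for that check, not etna's actual code, and the values are made up.

```python
import numpy as np
import pandas as pd


def timestamps_match(y_true: pd.Series, y_pred: pd.Series) -> bool:
    """Hypothetical check: do both series have values at the same timestamps?"""
    return y_true.dropna().index.equals(y_pred.dropna().index)


# One 7-day backtest fold (illustrative values only)
index = pd.date_range("2020-01-23", periods=7)
# The actuals keep their gap, because the fold's test part is not imputed
y_true = pd.Series([22.0, 23.0, 24.0, np.nan, 26.0, 27.0, 28.0], index=index)
# The forecast has a value at every timestamp
y_pred = pd.Series([21.0] * 7, index=index)

print(timestamps_match(y_true, y_pred))              # False: the gap removes one timestamp
print(timestamps_match(y_true.fillna(0.0), y_pred))  # True once the gap is filled
```

Under that assumption, any backtest fold whose test part contains a gap would fail the check, even though the forecast itself is fine.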
Expected behavior
Since we don't need to compute metrics to build prediction intervals, the code shouldn't fail because of errors in the metrics.
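As a rough sketch of what "intervals without metrics" could look like: the borders can be derived directly from backtest residuals, skipping timestamps where the actual value is missing. This is only an illustration under assumptions (normally distributed residuals, one global sigma); `residual_interval` and its parameters are hypothetical and not part of etna's API.

```python
from statistics import NormalDist

import numpy as np
import pandas as pd


def residual_interval(actuals, backtest_forecasts, point_forecast, quantiles=(0.025, 0.975)):
    """Hypothetical helper: interval borders from backtest residuals, ignoring gaps."""
    residuals = backtest_forecasts - actuals  # NaN wherever the actual value is missing
    sigma = np.nanstd(residuals.to_numpy())   # nanstd skips the gaps instead of failing
    return pd.DataFrame(
        {f"target_{q:.4g}": point_forecast + sigma * NormalDist().inv_cdf(q) for q in quantiles}
    )


# Illustrative data: the second actual value is a gap
actuals = pd.Series([1.0, np.nan, 3.0, 4.0])
backtest_forecasts = pd.Series([2.0, 2.0, 2.0, 2.0])
point_forecast = pd.Series([5.0, 5.0])

borders = residual_interval(actuals, backtest_forecasts, point_forecast)
print(borders)
```

No metric is computed at any point, so a gap in the actuals only reduces the number of residuals used to estimate sigma.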
How To Reproduce
Code
import pandas as pd

from etna.datasets import TSDataset
from etna.models import NaiveModel
from etna.pipeline import Pipeline
from etna.transforms import TimeSeriesImputerTransform

PREDICTION_INTERVAL = True  # the error only occurs when this is True
N = 50
HORIZON = 7

# Single-segment daily series with a linearly increasing target
df = pd.DataFrame(
    {
        "timestamp": pd.date_range("2020-01-01", periods=N),
        "segment": "segment_0",
        "target": list(range(N)),
    }
)

# Introduce gaps in the target
df.loc[40, "target"] = None
df.loc[10, "target"] = None
df.loc[25, "target"] = None

wide_df = TSDataset.to_dataset(df)
ts = TSDataset(wide_df, "D")
train, test = ts.train_test_split(test_size=HORIZON)

pipeline = Pipeline(
    model=NaiveModel(),
    transforms=[TimeSeriesImputerTransform()],
    horizon=HORIZON,
)
pipeline.fit(train)
forecast = pipeline.forecast(prediction_interval=PREDICTION_INTERVAL, n_folds=3)
The code above raises `ValueError: y_true and y_pred have different timestamps`.
Environment
No response
Additional context
No response
Checklist
- [X] Bug appears at the latest library version