
[BUG] Forecast with predictive intervals fails in case of data gaps

Open julia-shenshina opened this issue 1 year ago • 1 comments

🐛 Bug Report

I ran the code from the How To Reproduce section with PREDICTION_INTERVAL = False and everything works fine. With PREDICTION_INTERVAL = True it fails with an unclear error: ValueError: y_true and y_pred have different timestamps.

This error is raised by the metrics computation in the backtest that is run to generate the prediction intervals. Actually, we do not need to compute metrics to build prediction intervals 🤔
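To illustrate the kind of check that can produce this message, here is a minimal pandas-only sketch. It assumes the metric compares the sets of non-missing timestamps in y_true and y_pred; the function check_same_timestamps and the sample data are hypothetical and are not etna's actual implementation.

```python
import pandas as pd


def check_same_timestamps(y_true: pd.Series, y_pred: pd.Series) -> None:
    """Hypothetical stand-in for a metric's timestamp validation (not etna code)."""
    # Assumption: only timestamps with non-missing values are compared.
    true_ts = set(y_true.dropna().index)
    pred_ts = set(y_pred.dropna().index)
    if true_ts != pred_ts:
        raise ValueError("y_true and y_pred have different timestamps")


# One made-up fold: the raw target has a gap, the forecast does not.
index = pd.date_range("2020-02-03", periods=7)
y_true = pd.Series([33.0, 34.0, 35.0, 36.0, 37.0, 38.0, 39.0], index=index)
y_true.iloc[2] = None  # the data gap
y_pred = pd.Series(32.0, index=index)

try:
    check_same_timestamps(y_true, y_pred)
    error_message = None
except ValueError as exc:
    error_message = str(exc)

print(error_message)
```

Under that assumption, a single missing value in the true target of a fold is enough to make the check fail, even though the forecast itself is complete.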

Expected behavior

Since we don't need to compute metrics to build prediction intervals, the code shouldn't fail because of errors raised by the metrics.

How To Reproduce

Code

import pandas as pd
from etna.models import NaiveModel
from etna.pipeline import Pipeline
from etna.transforms import TimeSeriesImputerTransform
from etna.datasets import TSDataset

PREDICTION_INTERVAL = True

N = 50
HORIZON = 7

df = pd.DataFrame(
    {
        "timestamp": pd.date_range("2020-01-01", periods=N),
        "segment": "segment_0",
        "target": list(range(N))
    }
)

df.loc[40, "target"] = None
df.loc[10, "target"] = None
df.loc[25, "target"] = None

wide_df = TSDataset.to_dataset(df)
ts = TSDataset(wide_df, "D")
train, test = ts.train_test_split(test_size=HORIZON)

pipeline = Pipeline(model=NaiveModel(), transforms=[TimeSeriesImputerTransform()], horizon=HORIZON)
pipeline.fit(train)
forecast = pipeline.forecast(prediction_interval=PREDICTION_INTERVAL, n_folds=3)

The code above raises ValueError: y_true and y_pred have different timestamps.

Environment

No response

Additional context

No response

Checklist

  • [X] Bug appears at the latest library version

julia-shenshina · Sep 01 '22 12:09

Possible hotfix: create a dummy metric that does not check the timestamps of y_true and y_pred, and use it in place of the MAE metric here. This could cause some problems here, so we need to add a test that checks the intervals still work when the data has gaps.
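A rough sketch of what such a dummy metric could compute, written against plain pandas Series rather than etna's Metric class. The name lenient_mae and the sample data are hypothetical; wiring it into the pipeline would still require adapting it to whatever interface the backtest expects.

```python
import pandas as pd


def lenient_mae(y_true: pd.Series, y_pred: pd.Series) -> float:
    """Hypothetical dummy metric: MAE without any timestamp validation."""
    # Subtraction aligns on the union of both indexes; timestamps that are
    # missing on either side (or hold NaN) produce NaN in the difference.
    abs_error = (y_true - y_pred).abs()
    # Series.mean skips NaN by default, so gaps are simply ignored.
    return float(abs_error.mean())


index = pd.date_range("2020-01-01", periods=3)
y_true = pd.Series([1.0, None, 3.0], index=index)  # gap in the middle
y_pred = pd.Series([2.0, 2.0, 2.0], index=index)

score = lenient_mae(y_true, y_pred)
print(score)
```

Skipping gaps silently is a design choice that suits this hotfix only because the metric value is not used for the intervals; it would hide real problems in a metric that users rely on.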

julia-shenshina · Sep 02 '22 08:09