[ENH] add a method or property in ForecastX to access X predictions

Open yarnabrina opened this issue 2 years ago • 7 comments

During prediction, ForecastX first uses the forecaster for X (forecaster_X) to get future values of all X columns, possibly combining user-supplied values for a subset of columns with forecaster_X predictions for the complementary subset. The result is then fed into forecaster_y for the final prediction of y.
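The two-stage flow described above can be sketched roughly as follows. This is a toy illustration in plain Python, not sktime's implementation: `naive_last_value`, `build_future_X` and `predict_y` are hypothetical names, and dicts of lists stand in for data frames.

```python
def naive_last_value(history, horizon):
    """Hypothetical stand-in for forecaster_X: repeat the last observed value."""
    return [history[-1]] * horizon


def build_future_X(X_train, horizon, X_user=None):
    """Assemble the future X that would be passed on to forecaster_y.

    Columns supplied by the user are taken as-is; the complementary
    columns are forecast from their training history.
    """
    X_user = X_user or {}
    X_future = {}
    for col, history in X_train.items():
        if col in X_user:
            X_future[col] = list(X_user[col])
        else:
            X_future[col] = naive_last_value(history, horizon)
    return X_future


def predict_y(X_future, coef):
    """Hypothetical stand-in for forecaster_y: a fixed linear combination of X."""
    horizon = len(next(iter(X_future.values())))
    return [sum(coef[c] * X_future[c][t] for c in coef) for t in range(horizon)]


X_train = {"price": [10.0, 11.0, 12.0], "promo": [0.0, 1.0, 0.0]}
# The user supplies future "promo" values; "price" is forecast.
X_future = build_future_X(X_train, horizon=2, X_user={"promo": [1.0, 1.0]})
y_pred = predict_y(X_future, coef={"price": 2.0, "promo": 5.0})
```

Here `X_future` is the intermediate object this issue asks to expose: it exists inside predict but the caller only ever sees `y_pred`.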

I suggest we add a public method to access these X predictions, which are the final input to the model.

Also, as of now the method will always return the same fixed result, since fh for ForecastX cannot be varied. So the result could even be stored in a property (functools.cached_property perhaps) and exposed to users that way.
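A minimal sketch of the cached-property idea, assuming the horizon really is fixed per instance. `ToyForecastX`, `X_predictions_` and `n_computations` are hypothetical names for illustration and not part of sktime.

```python
from functools import cached_property


class ToyForecastX:
    """Hypothetical stand-in, not the sktime class."""

    def __init__(self, X_history, horizon):
        self.X_history = X_history
        self.horizon = horizon  # fixed at construction in this sketch
        self.n_computations = 0

    @cached_property
    def X_predictions_(self):
        # Computed on first access only, then stored in the instance __dict__.
        self.n_computations += 1
        return {
            col: [vals[-1]] * self.horizon
            for col, vals in self.X_history.items()
        }


toy = ToyForecastX({"price": [10.0, 12.0]}, horizon=3)
first = toy.X_predictions_
second = toy.X_predictions_  # served from the cache, no recomputation
```

Note that `cached_property` keeps the computed value in the instance `__dict__` after the first access, so from then on it is held in memory along with the rest of the object's state.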

yarnabrina avatar Dec 03 '23 05:12 yarnabrina

Hm, interesting idea.

I see this connected to two discussions:

  • the idea to have access to intermediate pipeline results in general, and the graph pipeline - FYI @benHeid
  • the discussion that we should actually store less - not more - data as attributes of forecasters: https://github.com/sktime/sktime/issues/5589, FYI @corradomio

So the two related questions are:

  • instead of making the public method specific to ForecastX, should we have something more general?
  • should we store this as an attribute, or do something more case-specific, where the user has to turn it on or request it? The simplest option is a param that turns it on, default off.
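The "param that turns it on, default off" option might look roughly like this. It is a sketch only: `ToyForecastXOptIn` and the parameter name `store_X_pred` are invented for illustration and do not exist in sktime.

```python
class ToyForecastXOptIn:
    """Hypothetical stand-in, not the sktime class."""

    def __init__(self, store_X_pred=False):
        # `store_X_pred` is an invented parameter name; off by default.
        self.store_X_pred = store_X_pred

    def predict(self, X_history, horizon):
        # Stand-in for forecaster_X: repeat the last observed value per column.
        X_pred = {col: [vals[-1]] * horizon for col, vals in X_history.items()}
        if self.store_X_pred:
            self.X_pred_ = X_pred  # kept only when explicitly requested
        # Stand-in for forecaster_y: sum the X columns at each step.
        return [sum(X_pred[c][t] for c in X_pred) for t in range(horizon)]


X_history = {"a": [1.0, 2.0], "b": [3.0]}

default = ToyForecastXOptIn()
y_default = default.predict(X_history, horizon=2)

opted_in = ToyForecastXOptIn(store_X_pred=True)
y_opted = opted_in.predict(X_history, horizon=2)
```

With the default setting nothing extra is kept on the object; only the opted-in instance carries `X_pred_` after predict.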

fkiraly avatar Dec 03 '23 14:12 fkiraly

If we just convert the existing private method _get_forecaster_X_prediction (private in the sense of starting with _ and not part of public documentation) to a public one, it is not affecting storing any more or less data. So I'd argue the second point is probably not an issue: the user gets the predictions if and only if they call the method themselves, without needing a separate parameter.

I'm not clear about the first point. If you meant that methods/attributes of an element of a pipeline are not exposed directly from the pipeline (not sure if that's the case, just guessing), exposing them would be very useful, I guess.

yarnabrina avatar Dec 04 '23 15:12 yarnabrina

I think one idea of accessing the intermediate results could be the callbacks: https://github.com/sktime/sktime/issues/4814

benHeid avatar Dec 05 '23 08:12 benHeid

If we just convert the existing private method _get_forecaster_X_prediction (private in the sense of starting with _ and not part of public documentation) to a public one, it is not affecting storing any more or less data.

For future reference, this is insufficient, as @benHeid pointed out in the community call yesterday. Making this method public will make the functionality work when one uses ForecastX directly, but if ForecastX is nested in any other estimator, that method will not be directly exposed to users.

yarnabrina avatar Feb 10 '24 06:02 yarnabrina

hm, if it uses the "fitted params" interface, e.g., gives access through an attribute or property ending in underscore, it is retrievable via get_fitted_params, even if nested
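As a rough illustration of why a trailing-underscore attribute stays reachable when nested, here is a toy version of the idea. It is not sktime's get_fitted_params implementation: `ToyEstimator` and its subclasses are hypothetical, and the `component__name` key format is a simplification assumed for this sketch.

```python
class ToyEstimator:
    """Hypothetical base class; mimics the trailing-underscore convention only."""

    def get_fitted_params(self):
        params = {}
        for name, value in vars(self).items():
            # Only public attributes ending in "_" count as fitted params here.
            if name.startswith("_") or not name.endswith("_"):
                continue
            key = name.rstrip("_")
            if isinstance(value, ToyEstimator):
                # Recurse into fitted components, prefixing with the component name.
                for sub_key, sub_value in value.get_fitted_params().items():
                    params[f"{key}__{sub_key}"] = sub_value
            else:
                params[key] = value
        return params


class ToyInnerForecastX(ToyEstimator):
    def fit(self):
        self.X_pred_ = {"price": [12.0, 12.0]}
        return self


class ToyPipeline(ToyEstimator):
    def __init__(self, forecaster):
        self.forecaster = forecaster

    def fit(self):
        self.forecaster_ = self.forecaster.fit()
        return self


pipe = ToyPipeline(ToyInnerForecastX()).fit()
fitted = pipe.get_fitted_params()
```

The outer object never defines an X-prediction accessor of its own, yet the inner value is reachable through the generic lookup under the key `forecaster__X_pred`.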

fkiraly avatar Feb 10 '24 13:02 fkiraly

I am not sure if we want to store that as a property. I don't know what all gets stored during serialisation, but this may be quite a big thing depending on the number of exogenous features.

yarnabrina avatar Feb 10 '24 16:02 yarnabrina

Hm, you are right.

I was thinking about "on-demand properties" only, some options:

  • rework get_fitted_params so some properties are available only if directly requested - either the function itself, or some trick that avoids serialization trouble
  • add a parameter only to ForecastX that controls whether the reference is stored
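One possible "trick that avoids serialization trouble" is to keep the reference in memory but drop it when pickling. The sketch below uses Python's standard `__getstate__` hook; `ToyForecastXNoPickle` is a hypothetical class, and whether this fits sktime's own serialization path is an open assumption.

```python
import pickle


class ToyForecastXNoPickle:
    """Hypothetical stand-in, not the sktime class."""

    def __init__(self):
        self.X_pred_ = None

    def predict(self, n_columns, horizon):
        # Stand-in for a potentially large table of X predictions.
        self.X_pred_ = [[0.0] * horizon for _ in range(n_columns)]
        return self

    def __getstate__(self):
        # Drop the X predictions before pickling; everything else is kept.
        state = self.__dict__.copy()
        state["X_pred_"] = None
        return state


est = ToyForecastXNoPickle().predict(n_columns=50, horizon=100)
payload = pickle.dumps(est)
restored = pickle.loads(payload)
```

The live object still exposes `X_pred_`, while the restored copy comes back with `X_pred_` set to `None`, so the size of the pickle no longer grows with the number of exogenous columns.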

fkiraly avatar Feb 10 '24 18:02 fkiraly