darts Extend backtesting methods to retrain after configurable number of time steps

See: https://github.com/unit8co/darts/issues/135#issuecomment-709979882

Dec 01 '21 12:12 hrzn

Hi, I would like to contribute to this issue. The solution I have in mind is very straightforward with minimal impact I think. The main method impacted will be historical_forecasts in the ForecastingModel base class, basically I would change method signature to: retrain: int = 1 (actual behaviour) instead of bool. Then, before the prediction loop at line 430, I would define a counter=0 variable which will get increment at each iteration. Finally I would change the train if at line 438 to something like if (retrain or not self._fit_called) and (counter % retrain == 0), and this should do it. Let me know what you think, if you think this can work I can make a pr soon. Thanks!

Jul 14 '22 09:07 chefPony

Hi @chefPony, thanks for your suggestion. That looks very good to me! Feel free to go ahead and open a PR (please check the guidelines before). Many thanks!

Jul 14 '22 15:07 hrzn

Hi @hrzn, I would like to propose a more general approach, which I find particularly useful for retraining. As a naive example to justify this, suppose that you want to retrain the model every 1st of the month, then the above approach wouldn't work.

The way to do that is to pass a callable to determine on which conditions you should retrain the model, leaving the user all the flexibility, i.e. something like:

if (not self._fit_called 
    or (retrain 
         and retrain_on_condition(pred_time, train_series, past_covariates, future_covariates))
    ):
    # retrain the model here

Where the function retrain_on_condition should be provided by the user, here a list of few examples:

import pandas as pd
from darts import TimeSeries

def retrain_on_condition(
    pred_time: pd.Timestamp = None,
    train_series: TimeSeries = None,
    past_covariates: TimeSeries = None,
    future_covariates: TimeSeries = None
):
    """
    Examples:
        return pred_time.hour == 0  # retrain every midnight (for freq="H") 
        return pred_time.day == 1  # retrain every month start (for freq="D") 
        return (pred_time.hour == 0) & (pred_time.day == 1)  # retrain every month start (for freq="H")
        return (pred_time.hour == 0) & (pred_time.dayofweek == 0)  # retrain every week (for freq="H")
    """
    return (pred_time.hour == 0) & (pred_time.day == 1)  # retrain every month start (for freq="H")

This adds a layer of complexity for the user, yet it is very flexible to similate and backtest what would happen in a production environment since it leaves the possibility to implement logics on the data series.

As an example, we retrain models every month first OR when we get some kind of alert on the data quality/drift

Aug 05 '22 08:08 FBruzzesi

Hi @FBruzzesi, thanks for the suggestion. That sounds like a good idea too! A paramount objective in Darts is to try to always keep it simple for users by default. So how about for instance mixing several of those ideas as follows:

If retrain is an int (or a bool), it determines the frequency (in number of timestamps) at which to retrain the model (basically as proposed by @chefPony). By default we can have retrain=1.
If retrain is a callable, we treat it as you suggest and call it at every timestamp to decide whether to retrain.

Would this sound good?

We'd be happy to receive a PR around that :)

Aug 07 '22 13:08 hrzn

Thanks for the feedback @hrzn, I completely agree that allowing for both implementations (int and callable) would lead to simplicity and flexibility at the same time, and require a contained implementation effort.

I can work on this issue, but before assigning it I would like to understand correctly few small implementation details, namely:

I believe we should agree on which arguments the callable can take. My suggestion would be:
- counter (int): iteration counter as described by @chefPony
- pred_time (pd.Timestamp): the current timestamp of iteration
- train (TimeSeries): the timeseries up to pred_time
- past_covariates (TimeSeries): the timeseries of past covariates up to pred_time
- future_covariates (TimeSeries): the timeseries of future covariate up to pred_time + forecast_horizon
I would consider writing a decorator that checks which arguments/signature the callable has, and add the unused arguments of the above list as None's. Let me justify this with a simple example: if all the logic I am implementing is based only one of the above arguments, from an user perspective it becomes cumbersome to add all unused arguments to the function, i.e.:

This is what I want to write
```
def retrain(pred_time: pd.Timestamp):
    return (pred_time.hour == 0) & (pred_time.day == 1)  # retrain every month start (for freq="H")
```
This is what I want to avoid writing
```
def retrain(
    counter: int = None,
    pred_time: pd.Timestamp = None,
    train_series: TimeSeries = None,
    past_covariates: TimeSeries = None,
    future_covariates: TimeSeries = None):
    return (pred_time.hour == 0) & (pred_time.day == 1)  # retrain every month start (for freq="H")
```
I am not sure if this is possible to do, but I would like to investigate it.
Last doubt, I am not sure how to proceed to test this feature improvement. Any suggestion/recommandation?

Aug 08 '22 09:08 FBruzzesi

Thanks for the feedback @hrzn, I completely agree that allowing for both implementations (int and callable) would lead to simplicity and flexibility at the same time, and require a contained implementation effort.

I can work on this issue, but before assigning it I would like to understand correctly few small implementation details, namely:
I believe we should agree on which arguments the callable can take. My suggestion would be:

counter (int): iteration counter as described by @chefPony

pred_time (pd.Timestamp): the current timestamp of iteration

train (TimeSeries): the timeseries up to pred_time

past_covariates (TimeSeries): the timeseries of past covariates up to pred_time

future_covariates (TimeSeries): the timeseries of future covariate up to pred_time + forecast_horizon
I would consider writing a decorator that checks which arguments/signature the callable has, and add the unused arguments of the above list as None's. Let me justify this with a simple example: if all the logic I am implementing is based only one of the above arguments, from an user perspective it becomes cumbersome to add all unused arguments to the function, i.e.: This is what I want to write
def retrain(pred_time: pd.Timestamp):
    return (pred_time.hour == 0) & (pred_time.day == 1)  # retrain every month start (for freq="H")
This is what I want to avoid writing
def retrain(
    counter: int = None,
    pred_time: pd.Timestamp = None,
    train_series: TimeSeries = None,
    past_covariates: TimeSeries = None,
    future_covariates: TimeSeries = None):
    return (pred_time.hour == 0) & (pred_time.day == 1)  # retrain every month start (for freq="H")
I am not sure if this is possible to do, but I would like to investigate it.
Last doubt, I am not sure how to proceed to test this feature improvement. Any suggestion/recommandation?

Thanks, overall what you propose for filling in default arguments definitely makes sense (and I think that should be possible). A couple of small remarks:

Do you think we need counter: int as a parameter? It seems to me this case would be covered by the case where retrain is an int already.
pred_time should be of type Union[pd.Timestamp, int], as TimeSeries can be integer-indexed too.
past_covariates and future_covariates can have larger time spans than what you mention (this is because Darts will slice them as required to keep only the relevant parts) - that's a detail and not a concern for this particular task, though.

Concerning testing, I think you can write a new test function e.g., in test_local_forecasting_models.py that runs the backtest with a custom function (such as the one you give in example), and then check that the result(s) are as expected with a fixed random seed.

Aug 08 '22 14:08 hrzn

Do you think we need counter: int as a parameter? It seems to me this case would be covered by the case where retrain is an int already.

Yes yes! You are absolutely right, it's already covered in such case, I was just trying to put everything together in the same functionality.

pred_time should be of type Union[pd.Timestamp, int], as TimeSeries can be integer-indexed too.

Maybe I am missing something but, while start as argument can be Union[pd.Timestamp, int, float], then it gets converted to pd.Timestamp, (in start = series.get_timestamp_at_point(start)) and from there onwards pred_times is of type List[pd.Timestamp] . Hence for pred_time in iterator should yield all pd.Timestamp variables.

past_covariates and future_covariates can have larger time spans than what you mention (this is because Darts will slice them as required to keep only the relevant parts) - that's a detail and not a concern for this particular task, though.

Yes of course, but let's say I want to retrain on some condition for past_covariates, instead of reasoning on the whole series, I find it easier to reason about the series until now (and respectively for the series of future covariates until I have information for the next prediction). As an example, let's say I would like to retrain the model each time that some past covariates is below a certain value, then the retrain function should be something like:

def retrain(past_covariates: TimeSeries) -> bool:
    return past_covariates["some_col_name"].values()[-1] < some_threshold_value

where the past_covariates argument for retrain should be past_covariates.drop_after(pred_time), hence recomputed at each value of pred_time, but of course not overwriting the original argument (in the same way that train is computed). Does this make sense?

Addressing test(s): I may need some help; actually all we need to test for is that the historical_forecasts method enters the retrain condition the on the expected condition, yet I am not sure on how to proceed.

Finally, I started working on the retrain function decorator:

Should that be added in utils/utils.py as the other (check) decorators?
Do you have any suggestion for the name? I was thinking of _retrain_wrapper

Aug 08 '22 15:08 FBruzzesi

I would consider writing a decorator that checks which arguments/signature the callable has, and add the unused arguments of the above list as None's. Let me justify this with a simple example: if all the logic I am implementing is based only one of the above arguments, from an user perspective it becomes cumbersome to add all unused arguments to the function, i.e.:
def retrain(pred_time: pd.Timestamp):
   return (pred_time.hour == 0) & (pred_time.day == 1)  # retrain every month start (for freq="H")
This is what I want to avoid writing
def retrain(
   counter: int = None,
   pred_time: pd.Timestamp = None,
   train_series: TimeSeries = None,
   past_covariates: TimeSeries = None,
   future_covariates: TimeSeries = None):
  return (pred_time.hour == 0) & (pred_time.day == 1)  # retrain every month start (for freq="H")
I am not sure if this is possible to do, but I would like to investigate it.

Am I missing something or you would still need to pass all the arguments to _retrain_wrapper inside historical_forecasts? i.e:

if _retrain_wrapper(counter, pred_time, train_series, past_covariates, future_covariates) or not self._fit_called:
    self._fit_wrapper(
        series=train,
        past_covariates=past_covariates,
        future_covariates=future_covariates,
    )

If yes, we could just extend the signature of the user provided callable with kwargs and ignore whatever keyword argument is not in the original signature. Something like:

from inspect import signature

def extend_signature(fun):
    def retrain_wrapper(**kwargs):
        fun_param = {param:value for param, value in kwargs.items() if param in signature(fun).parameters}
        return fun(**fun_param)
    return retrain_wrapper

@extend_signature
def retrain_every_5(counter):
    return (counter % 5 == 0)

retrain_every_5(counter=10) #=> True
retrain_every_5(counter=5, pred_time=pd.Timestamp.today()) #=> True
retrain_every_5(pred_time=pd.Timestamp.today()) #=> Error missing parameter counter

Aug 08 '22 18:08 chefPony

@chefPony

Am I missing something or you would still need to pass all the arguments to _retrain_wrapper inside historical_forecasts?

Yes correct, that's exactly what I was thinking, passing all the agreed upon arguments and ignoring the ones not in the function original signature. That's the reasoning behind this

I believe we should agree on which arguments the callable can take. My suggestion would be:

counter (int): iteration counter as described by @chefPony

pred_time (pd.Timestamp): the current timestamp of iteration

train (TimeSeries): the timeseries up to pred_time

past_covariates (TimeSeries): the timeseries of past covariates up to pred_time

future_covariates (TimeSeries): the timeseries of future covariate up to pred_time + forecast_horizon

Please let me know if you are working on this or I can proceed with the pull request. I am left with tests implementation and the other minor details asked above.

Aug 09 '22 07:08 FBruzzesi

Please let me know if you are working on this or I can proceed with the pull request. I am left with tests implementation and the other minor details asked above.

Go ahead ;)

Aug 09 '22 08:08 chefPony

Maybe I am missing something but, while start as argument can be Union[pd.Timestamp, int, float], then it gets converted to pd.Timestamp, (in start = series.get_timestamp_at_point(start)) and from there onwards pred_times is of type List[pd.Timestamp] . Hence for pred_time in iterator should yield all pd.Timestamp variables.

series.get_timestamp_at_point(start) can return a pd.Timestamp or an int, as some TimeSeries are not indexed by pd.Timestamp at all, but simply by integers. So this type should really by Union[pd.Timestamp, int]; I'll bring these comments to the PR directly.

Aug 10 '22 13:08 hrzn

Released in v0.22.0 🚀

Jan 05 '23 10:01 hrzn