Extend backtesting methods to retrain after configurable number of time steps
See: https://github.com/unit8co/darts/issues/135#issuecomment-709979882
Hi, I would like to contribute to this issue. The solution I have in mind is very straightforward with minimal impact I think. The main method impacted will be historical_forecasts in the ForecastingModel base class, basically I would change method signature to: retrain: int = 1 (actual behaviour) instead of bool. Then, before the prediction loop at line 430, I would define a counter=0 variable which will get increment at each iteration. Finally I would change the train if at line 438 to something like if (retrain or not self._fit_called) and (counter % retrain == 0), and this should do it.
Let me know what you think, if you think this can work I can make a pr soon.
Thanks!
Hi @chefPony, thanks for your suggestion. That looks very good to me! Feel free to go ahead and open a PR (please check the guidelines before). Many thanks!
Hi @hrzn, I would like to propose a more general approach, which I find particularly useful for retraining. As a naive example to justify this, suppose that you want to retrain the model every 1st of the month, then the above approach wouldn't work.
The way to do that is to pass a callable to determine on which conditions you should retrain the model, leaving the user all the flexibility, i.e. something like:
if (not self._fit_called
or (retrain
and retrain_on_condition(pred_time, train_series, past_covariates, future_covariates))
):
# retrain the model here
Where the function retrain_on_condition should be provided by the user, here a list of few examples:
import pandas as pd
from darts import TimeSeries
def retrain_on_condition(
pred_time: pd.Timestamp = None,
train_series: TimeSeries = None,
past_covariates: TimeSeries = None,
future_covariates: TimeSeries = None
):
"""
Examples:
return pred_time.hour == 0 # retrain every midnight (for freq="H")
return pred_time.day == 1 # retrain every month start (for freq="D")
return (pred_time.hour == 0) & (pred_time.day == 1) # retrain every month start (for freq="H")
return (pred_time.hour == 0) & (pred_time.dayofweek == 0) # retrain every week (for freq="H")
"""
return (pred_time.hour == 0) & (pred_time.day == 1) # retrain every month start (for freq="H")
This adds a layer of complexity for the user, yet it is very flexible to similate and backtest what would happen in a production environment since it leaves the possibility to implement logics on the data series.
As an example, we retrain models every month first OR when we get some kind of alert on the data quality/drift
Hi @FBruzzesi, thanks for the suggestion. That sounds like a good idea too! A paramount objective in Darts is to try to always keep it simple for users by default. So how about for instance mixing several of those ideas as follows:
- If
retrainis anint(or abool), it determines the frequency (in number of timestamps) at which to retrain the model (basically as proposed by @chefPony). By default we can haveretrain=1. - If
retrainis a callable, we treat it as you suggest and call it at every timestamp to decide whether to retrain.
Would this sound good?
We'd be happy to receive a PR around that :)
Thanks for the feedback @hrzn, I completely agree that allowing for both implementations (int and callable) would lead to simplicity and flexibility at the same time, and require a contained implementation effort.
I can work on this issue, but before assigning it I would like to understand correctly few small implementation details, namely:
-
I believe we should agree on which arguments the callable can take. My suggestion would be:
counter(int): iteration counter as described by @chefPonypred_time(pd.Timestamp): the current timestamp of iterationtrain(TimeSeries): the timeseries up topred_timepast_covariates(TimeSeries): the timeseries of past covariates up topred_timefuture_covariates(TimeSeries): the timeseries of future covariate up topred_time + forecast_horizon
-
I would consider writing a decorator that checks which arguments/signature the callable has, and add the unused arguments of the above list as
None's. Let me justify this with a simple example: if all the logic I am implementing is based only one of the above arguments, from an user perspective it becomes cumbersome to add all unused arguments to the function, i.e.:This is what I want to write
def retrain(pred_time: pd.Timestamp): return (pred_time.hour == 0) & (pred_time.day == 1) # retrain every month start (for freq="H")This is what I want to avoid writing
def retrain( counter: int = None, pred_time: pd.Timestamp = None, train_series: TimeSeries = None, past_covariates: TimeSeries = None, future_covariates: TimeSeries = None): return (pred_time.hour == 0) & (pred_time.day == 1) # retrain every month start (for freq="H")I am not sure if this is possible to do, but I would like to investigate it.
-
Last doubt, I am not sure how to proceed to test this feature improvement. Any suggestion/recommandation?
Thanks for the feedback @hrzn, I completely agree that allowing for both implementations (
intandcallable) would lead to simplicity and flexibility at the same time, and require a contained implementation effort.I can work on this issue, but before assigning it I would like to understand correctly few small implementation details, namely:
I believe we should agree on which arguments the callable can take. My suggestion would be:
counter(int): iteration counter as described by @chefPonypred_time(pd.Timestamp): the current timestamp of iterationtrain(TimeSeries): the timeseries up topred_timepast_covariates(TimeSeries): the timeseries of past covariates up topred_timefuture_covariates(TimeSeries): the timeseries of future covariate up topred_time + forecast_horizonI would consider writing a decorator that checks which arguments/signature the callable has, and add the unused arguments of the above list as
None's. Let me justify this with a simple example: if all the logic I am implementing is based only one of the above arguments, from an user perspective it becomes cumbersome to add all unused arguments to the function, i.e.: This is what I want to writedef retrain(pred_time: pd.Timestamp): return (pred_time.hour == 0) & (pred_time.day == 1) # retrain every month start (for freq="H")This is what I want to avoid writing
def retrain( counter: int = None, pred_time: pd.Timestamp = None, train_series: TimeSeries = None, past_covariates: TimeSeries = None, future_covariates: TimeSeries = None): return (pred_time.hour == 0) & (pred_time.day == 1) # retrain every month start (for freq="H")I am not sure if this is possible to do, but I would like to investigate it.
Last doubt, I am not sure how to proceed to test this feature improvement. Any suggestion/recommandation?
Thanks, overall what you propose for filling in default arguments definitely makes sense (and I think that should be possible). A couple of small remarks:
- Do you think we need
counter: intas a parameter? It seems to me this case would be covered by the case whereretrainis anintalready. pred_timeshould be of typeUnion[pd.Timestamp, int], asTimeSeriescan be integer-indexed too.past_covariatesandfuture_covariatescan have larger time spans than what you mention (this is because Darts will slice them as required to keep only the relevant parts) - that's a detail and not a concern for this particular task, though.
Concerning testing, I think you can write a new test function e.g., in test_local_forecasting_models.py that runs the backtest with a custom function (such as the one you give in example), and then check that the result(s) are as expected with a fixed random seed.
Do you think we need
counter: intas a parameter? It seems to me this case would be covered by the case whereretrainis anintalready.
Yes yes! You are absolutely right, it's already covered in such case, I was just trying to put everything together in the same functionality.
pred_timeshould be of typeUnion[pd.Timestamp, int], asTimeSeriescan be integer-indexed too.
Maybe I am missing something but, while start as argument can be Union[pd.Timestamp, int, float], then it gets converted to pd.Timestamp, (in start = series.get_timestamp_at_point(start)) and from there onwards pred_times is of type List[pd.Timestamp] . Hence for pred_time in iterator should yield all pd.Timestamp variables.
past_covariates and future_covariates can have larger time spans than what you mention (this is because Darts will slice them as required to keep only the relevant parts) - that's a detail and not a concern for this particular task, though.
Yes of course, but let's say I want to retrain on some condition for past_covariates, instead of reasoning on the whole series, I find it easier to reason about the series until now (and respectively for the series of future covariates until I have information for the next prediction).
As an example, let's say I would like to retrain the model each time that some past covariates is below a certain value, then the retrain function should be something like:
def retrain(past_covariates: TimeSeries) -> bool:
return past_covariates["some_col_name"].values()[-1] < some_threshold_value
where the past_covariates argument for retrain should be past_covariates.drop_after(pred_time), hence recomputed at each value of pred_time, but of course not overwriting the original argument (in the same way that train is computed). Does this make sense?
Addressing test(s): I may need some help; actually all we need to test for is that the historical_forecasts method enters the retrain condition the on the expected condition, yet I am not sure on how to proceed.
Finally, I started working on the retrain function decorator:
- Should that be added in
utils/utils.pyas the other (check) decorators? - Do you have any suggestion for the name? I was thinking of
_retrain_wrapper
I would consider writing a decorator that checks which arguments/signature the callable has, and add the unused arguments of the above list as None's. Let me justify this with a simple example: if all the logic I am implementing is based only one of the above arguments, from an user perspective it becomes cumbersome to add all unused arguments to the function, i.e.:
def retrain(pred_time: pd.Timestamp): return (pred_time.hour == 0) & (pred_time.day == 1) # retrain every month start (for freq="H")This is what I want to avoid writing
def retrain( counter: int = None, pred_time: pd.Timestamp = None, train_series: TimeSeries = None, past_covariates: TimeSeries = None, future_covariates: TimeSeries = None): return (pred_time.hour == 0) & (pred_time.day == 1) # retrain every month start (for freq="H")I am not sure if this is possible to do, but I would like to investigate it.
Am I missing something or you would still need to pass all the arguments to _retrain_wrapper inside historical_forecasts? i.e:
if _retrain_wrapper(counter, pred_time, train_series, past_covariates, future_covariates) or not self._fit_called:
self._fit_wrapper(
series=train,
past_covariates=past_covariates,
future_covariates=future_covariates,
)
If yes, we could just extend the signature of the user provided callable with kwargs and ignore whatever keyword argument is not in the original signature. Something like:
from inspect import signature
def extend_signature(fun):
def retrain_wrapper(**kwargs):
fun_param = {param:value for param, value in kwargs.items() if param in signature(fun).parameters}
return fun(**fun_param)
return retrain_wrapper
@extend_signature
def retrain_every_5(counter):
return (counter % 5 == 0)
retrain_every_5(counter=10) #=> True
retrain_every_5(counter=5, pred_time=pd.Timestamp.today()) #=> True
retrain_every_5(pred_time=pd.Timestamp.today()) #=> Error missing parameter counter
@chefPony
Am I missing something or you would still need to pass all the arguments to _retrain_wrapper inside historical_forecasts?
Yes correct, that's exactly what I was thinking, passing all the agreed upon arguments and ignoring the ones not in the function original signature. That's the reasoning behind this
I believe we should agree on which arguments the callable can take. My suggestion would be:
- counter (int): iteration counter as described by @chefPony
- pred_time (pd.Timestamp): the current timestamp of iteration
- train (TimeSeries): the timeseries up to pred_time
- past_covariates (TimeSeries): the timeseries of past covariates up to pred_time
- future_covariates (TimeSeries): the timeseries of future covariate up to pred_time + forecast_horizon
Please let me know if you are working on this or I can proceed with the pull request. I am left with tests implementation and the other minor details asked above.
Please let me know if you are working on this or I can proceed with the pull request. I am left with tests implementation and the other minor details asked above.
Go ahead ;)
Maybe I am missing something but, while
startas argument can beUnion[pd.Timestamp, int, float], then it gets converted topd.Timestamp, (instart = series.get_timestamp_at_point(start)) and from there onwardspred_timesis of typeList[pd.Timestamp]. Hencefor pred_time in iteratorshould yield allpd.Timestampvariables.
series.get_timestamp_at_point(start) can return a pd.Timestamp or an int, as some TimeSeries are not indexed by pd.Timestamp at all, but simply by integers. So this type should really by Union[pd.Timestamp, int]; I'll bring these comments to the PR directly.
Released in v0.22.0 🚀