
Easy Sample Weights

Open Beerstabr opened this issue 2 years ago • 18 comments

Often in forecasting it makes sense to use sample weights that make your model focus more on the recent history, and with most Sklearn models you can introduce this through the fit method. It would be great if Darts made it easy to apply sensible weighting schemes for forecasting, such as an exponentially decaying weighting function.
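For context, this is the plain scikit-learn mechanism being referred to (a minimal sketch with toy data; the 0.95 decay factor is just an example):

import numpy as np
from sklearn.linear_model import LinearRegression

# most sklearn estimators accept per-sample weights in fit()
rng = np.random.default_rng(0)
X, y = rng.random((100, 3)), rng.random(100)

# exponentially decaying weights: the most recent row gets weight 1
weights = 0.95 ** np.arange(len(y))[::-1]
LinearRegression().fit(X, y, sample_weight=weights)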

Many thanks for the library!

Beerstabr avatar Aug 30 '22 07:08 Beerstabr

good idea, adding to the backlog :) (and contributions are welcome!)

hrzn avatar Aug 31 '22 13:08 hrzn

I would definitely like to contribute!

Beerstabr avatar Sep 01 '22 07:09 Beerstabr

I was thinking of solving it much like the _create_lagged_data function from RegressionModel class.

Starting out with three options:

  • equal weights
  • linearly decaying weights
  • exponentially decaying weights (formula given as an image in the original; see the sketch below)
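For concreteness, a minimal sketch of the three schemes (make_weights and the decay parameter are illustrative names, not a Darts API):

import numpy as np

def make_weights(n: int, scheme: str = "exponential", decay: float = 0.9) -> np.ndarray:
    """Return n sample weights, ordered oldest to most recent."""
    if scheme == "equal":
        return np.ones(n)
    if scheme == "linear":
        return np.linspace(1.0 / n, 1.0, n)  # oldest point gets the smallest weight
    if scheme == "exponential":
        return decay ** np.arange(n - 1, -1, -1)  # most recent point gets weight 1
    raise ValueError(f"unknown scheme: {scheme}")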

Beerstabr avatar Sep 06 '22 07:09 Beerstabr

Hi @Beerstabr, after checking, I think it should already be possible to do something like this:

my_model = RegressionModel(..., lags=n_lags)
my_model.fit(..., sample_weight=[1. / (in_len - i) for i in range(n_lags)])

because all the kwargs received by fit() are passed to the underlying estimator's fit() method.

hrzn avatar Sep 07 '22 12:09 hrzn

Hi @hrzn, yes that's true. That's how I am currently doing it.

However, it gets slightly more complicated when you start using lags, an output_chunk_length > 1, or multiple training series (and there are probably other things to consider as well).

For example, when you use 8 lags your series gets cut short by 8 data points. In that case I think it should be:

my_model = RegressionModel(..., lags=n_lags, input_chunk_length=in_len)
my_model.fit(..., sample_weight=[1. / (in_len - i) for i in range(in_len - n_lags)])

And, if you want both lags and output_chunk_length > 1, then I believe it should be:

my_model = RegressionModel(..., lags=n_lags, output_chunk_length=out_len, input_chunk_length=in_len)
my_model.fit(..., sample_weight=[1. / (in_len - i) for i in range(in_len - np.max([n_lags, out_len]))])
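As a sanity check on these lengths, for a single gap-free series with target lags only, the number of tabularized samples can be counted directly (this helper is an assumption for illustration, not part of Darts):

def n_tabularized_samples(series_len: int, n_lags: int, out_len: int = 1) -> int:
    # each training row consumes n_lags past points plus out_len future points,
    # so this many complete windows fit into the series
    return series_len - n_lags - out_len + 1

# e.g. a series of length 100 with 8 lags and output_chunk_length=3
print(n_tabularized_samples(100, 8, 3))  # -> 90

The sample_weight list passed to fit() would then need exactly this many entries.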

And finally, when you're training on multiple series and these series differ in length, it gets a bit more complicated. In that case you'll need to take into account the order and the difference in length of the series. For example, in the case of exponentially decaying weights it could be like this:

import numpy as np

# function for calculating exponentially decaying sample weights
def exponential_sample_weights(ts, n_lags=8, multiple_series=False, max_series_length=np.nan):
    if not multiple_series:
        T = len(ts) - n_lags
        sample_weights = [-np.log(1 - t / T) / (T - 1) for t in range(1, T + 1) if t < T] + [np.log(T) / (T - 1)]
    else:
        # use the longest series' sample count so all series share the same weight scale
        T = max_series_length - n_lags
        T_self = len(ts) - n_lags
        sample_weights = [-np.log(1 - t / T) / (T - 1) for t in range(1 + (T - T_self), T + 1) if t < T] + [np.log(T) / (T - 1)]
    return sample_weights

# create a list with the weights in the same order as the series to which they belong
seq_sample_weights = []
max_len = np.max([len(series) for series in seq_series])
for series in seq_series:
    seq_sample_weights += exponential_sample_weights(ts=series,
                                                     n_lags=n_lags,
                                                     multiple_series=True,
                                                     max_series_length=max_len)

# fit the model (without a specific input_chunk_length)
my_model = RegressionModel(..., lags=n_lags)
my_model.fit(..., sample_weight=seq_sample_weights)

In the latter case you have to be very mindful: when you train on multiple series that differ in length, T should be the same for all series when calculating the exponentially decaying weights, if you want to put equal weight on each series.

So, if you want to apply sample weights in Darts today, you need to know what happens behind the scenes. Otherwise it's hard to get it working in the non-trivial cases and it's easy to make mistakes. Therefore I think it would be nice to have an easier way of doing it, like:

my_model = RegressionModel(..., lags=n_lags)
my_model.fit(..., sample_weight_type='exponential')

Later on you could also add functionality to let the model focus more on specific series. But I would say that's of lesser importance.

Beerstabr avatar Sep 08 '22 10:09 Beerstabr

Hi @Beerstabr, first off, I'm sorry because I realised I made a mistake in my previous message - the sample_weight are (obviously) per-sample weights and not per-dimension weights, as I was too quick to assume. Indeed, the actual number of samples is a relatively non-trivial function of the input chunk length (or number of lags used on the target), the number of targets, and potentially the parameter max_samples_per_ts. Then, once all samples are built (this is done in the function RegressionModel._create_lagged_data()), the weights should be assigned to each sample as a function of how far in the past its target values lie.
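To make the coupling concrete, a rough sketch of that assignment step (build_sample_weights and the oldest-first sample ordering are assumptions for illustration, not the actual Darts internals):

import numpy as np

def build_sample_weights(n_samples_per_series, decay=0.9):
    """Concatenate per-series weight vectors in tabularization order.

    Assumes each series' samples appear oldest-first in the design matrix.
    """
    chunks = [decay ** np.arange(n - 1, -1, -1) for n in n_samples_per_series]
    return np.concatenate(chunks)

# e.g. two series contributing 90 and 40 samples after tabularization
weights = build_sample_weights([90, 40])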

I think it can be done and it could be a pretty nice feature indeed. However, it would also add a bit of complexity, because it would be strongly coupled to the tabularization logic. Nevertheless, if you feel like tackling it, we would be very happy to receive a PR in this direction. That said, I would recommend waiting a little before you start, as we have a couple of other initiatives ongoing that touch the tabularization itself, so it'd be better to do it afterwards to avoid conflicts.

hrzn avatar Sep 12 '22 09:09 hrzn

Hi @hrzn,

Seems to me like a fun challenge to tackle. I’ll wait for the right moment though. How will I know when the ongoing initiatives are done? Are there specific backlog items I can follow?

Beerstabr avatar Sep 13 '22 08:09 Beerstabr

Hi @Beerstabr,

The PR refactoring the tabularization has been merged. If you're still interested in implementing this feature, it's more than welcome!

madtoinou avatar Mar 23 '23 14:03 madtoinou

Definitely! It’s really just scratching my own itch, because I would love to use the feature myself.

Beerstabr avatar Mar 23 '23 14:03 Beerstabr

Hi! I would highly appreciate this feature as well. I currently pass sample weights to the fit method the following way:

  1. create Darts TimeSeries holding the sample weights (in my case a list of TimeSeries)
  2. recompute the _get_feature_times and get_shared_times from the tabularization module (very redundant)
  3. slice the sample weights (Darts TimeSeries) based on the shared times
  4. convert them into a numpy array
  5. pass them to sample_weight as an additional keyword argument forwarded to the fit method of the underlying model
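A rough sketch of steps 3-5 under those assumptions (shared_times as a sorted time index recovered in step 2, weights_ts as the per-timestamp weight TimeSeries from step 1; the names are illustrative):

# 3. slice the weights series down to the timestamps actually used for training
weights_sliced = weights_ts.slice(shared_times[0], shared_times[-1])

# 4. convert the sliced TimeSeries into a flat numpy array
weights_arr = weights_sliced.values().flatten()

# 5. forward to the underlying estimator's fit() as a keyword argument
model.fit(train_series, sample_weight=weights_arr)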

daniel-ressi avatar Oct 03 '23 09:10 daniel-ressi

This would be a big addition. Weights would also allow alternative ways to handle missing values; e.g., https://cienciadedatos.net/documentos/py46-forecasting-time-series-missing-values.html

gofford avatar Oct 05 '23 10:10 gofford

Hi! There is an idea to make the weights part of the TimeSeries class, as an attribute on the underlying xarray (like static covariates or a hierarchy). I could contribute if the idea sounds valid

BohdanBilonoh avatar Apr 08 '24 19:04 BohdanBilonoh

There is an upcoming PR that will offer the possibility to either generate weights during tabularization or provide them as a TimeSeries when training the model. The logic is implemented; the contributor is now working on the tests.

I am not sure that adding it as an attribute of TimeSeries is the approach we want to take, as TimeSeries are immutable and one might be interested in testing several weighting approaches.

madtoinou avatar Apr 09 '24 07:04 madtoinou

Sounds interesting. My motivation was to make the sample weights part of the input and use them as weight_cols for TimeSeries.from_dataframe. This could allow all slicing logic to be hidden behind the TimeSeries class, and allow weight values not only per sample, but per timestamp and/or per component. Does the new logic you mentioned cover such abilities?

BohdanBilonoh avatar Apr 10 '24 06:04 BohdanBilonoh

The slicing logic will be hidden, but in the tabularization.

The upcoming implementation makes it possible to associate a weight with each timestamp, which is then converted to sample weights. I don't see how weighting could be performed on the component dimension; would you mind describing how this could be leveraged?
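For illustration, a sketch of what such a per-timestamp weight series could look like (the exact types the PR will accept are not settled here; the linearly increasing values are just an example):

import numpy as np
from darts import TimeSeries
from darts.utils.timeseries_generation import linear_timeseries

series = linear_timeseries(length=100)  # toy target

# hypothetical: one weight per timestamp, aligned with the target's time index
weights = TimeSeries.from_times_and_values(
    times=series.time_index,
    values=np.linspace(0.1, 1.0, len(series)),  # older points weighted less
)
# model.fit(series, sample_weight=weights)  # converted internally to per-sample weights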

madtoinou avatar Apr 10 '24 06:04 madtoinou

It will be interesting to see the code of the new logic.

Very simple example: e-commerce time series that contain revenue and margin as targets, which have to be predicted simultaneously (using the TiDE model), but where revenue is more important than margin

BohdanBilonoh avatar Apr 10 '24 07:04 BohdanBilonoh

I will make sure that the PR implementing this new feature gets linked to this issue.

I think that this kind of "bias" should come from the loss/objective function; it's not really possible to influence a model to favor the optimization of one target component over another through any other mechanism (at least to my knowledge). The model is usually responsible for identifying the most informative features (lags/components).
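For the revenue-vs-margin case, one way to express such a preference with the existing torch-based models is a custom loss; a minimal sketch (the 2.0/1.0 weights and the component order are illustrative assumptions):

import torch

component_weights = torch.tensor([2.0, 1.0])  # e.g. weight revenue twice as much as margin

def weighted_mse(pred: torch.Tensor, target: torch.Tensor) -> torch.Tensor:
    # squared error weighted over the last (component) dimension
    return ((pred - target) ** 2 * component_weights).mean()

# torch-based Darts models accept a custom torch loss via loss_fn, e.g.:
# model = TiDEModel(input_chunk_length=24, output_chunk_length=12, loss_fn=weighted_mse)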

madtoinou avatar Apr 10 '24 12:04 madtoinou

My vision of the sample weights was similar to the weights passed to Likelihood.compute_loss; in that scenario, samples and/or timestamps and/or components could be weighted

BohdanBilonoh avatar Apr 10 '24 14:04 BohdanBilonoh