
[Question] How to perform temporal embedding in DARTS?

Open guimalo opened this issue 10 months ago • 2 comments

I want to understand how I can use date information as features for my machine learning model in Darts. I want to derive new features from dates and use them as columns in a regression-based forecaster. I'm quite confused by the covariates terminology, and it isn't clear what happens under the hood in Darts when using add_encoders.

In my case, I do not have any exogenous variables (past/future covariates); I just want to capture seasonality using temporal embedding features, such as sine and cosine, for both training and inference. For example, let's say that I have the following data:

Date Sales
2024-04-01 100
2024-04-08 150
2024-04-15 200
2024-04-22 180
2024-04-29 220
2024-05-06 250
2024-05-13 280
2024-05-20 300
2024-05-27 320
2024-06-03 350
2024-06-10 380
2024-06-17 400

Starting from this series, how do I create features like 'year', 'month_of_year', 'week_of_year', 'day_of_year', 'month_of_quarter', 'week_of_quarter', 'day_of_quarter', 'week_of_month' for training and inference? Is there an easy way to do this in Darts?

I'm talking about date features here, but the documentation also does not make it clear to me how Darts handles ML forecasting in general. The template example is as follows (here using CatBoost):

target = series['p (mbar)'][:100]
# optionally, use past observed rainfall (pretending to be unknown beyond index 100)
past_cov = series['rain (mm)'][:100]
# optionally, use future temperatures (pretending this component is a forecast)
future_cov = series['T (degC)'][:106]
# predict 6 pressure values using the 12 past values of pressure and rainfall, as well as the 6 temperature
# values corresponding to the forecasted period
model = CatBoostModel(
    lags=12,
    lags_past_covariates=12,
    lags_future_covariates=[0,1,2,3,4,5],
    output_chunk_length=6
)
model.fit(target, past_covariates=past_cov, future_covariates=future_cov)
pred = model.predict(6)

What does it mean to use the 12 past values of pressure and rainfall? What about the 88 other data points? How does the library actually do the calculations for the data?

Thank you.

guimalo, Apr 23 '24 17:04

Also, I have a question about how to create a pipeline where I deseasonalize and detrend my data, make the desired forecasts, and then invert these transformations on the forecasted data. Is there an easy way to do that?

guimalo, Apr 23 '24 18:04

Hi @guimalo,

When you assign a value to add_encoders, the model creates the corresponding covariates "on the fly" during training and inference. In your case, since you are encoding information about the time axis, these can be treated as future covariates (we know in advance which day of the week or month of the year each timestamp falls on, for an arbitrary number of steps). You can see them as "implicit" covariates, handled for you under the hood. If you prefer, you can of course create the encoders manually and explicitly pass the returned TimeSeries as covariates:

from darts.dataprocessing.encoders.encoders import FutureCyclicEncoder
from darts.models import CatBoostModel
from darts.utils.timeseries_generation import sine_timeseries
from pandas import Timestamp

model = CatBoostModel(
    lags=[-5, -3, -1],
    output_chunk_length=2,
    lags_future_covariates=[-2, 0, 2],
)

# build the encoder with the same lags as the model (note: _get_lags is a private method)
encoder = FutureCyclicEncoder(
    attribute="month",
    input_chunk_length=abs(min(model._get_lags("target"))),
    output_chunk_length=model.output_chunk_length,
    lags_covariates=model._get_lags("future"),
)

ts_target = sine_timeseries(length=100, start=Timestamp("01-01-2000"))

# encode the time axis for training and for the n=5 steps to predict
axis_encoding = encoder.encode_train_inference(
    n=5,
    target=ts_target,
)

model.fit(ts_target, future_covariates=axis_encoding)
model.predict(5)

You can create a Pipeline and, if your transforms are invertible, transform your forecast back to the original range: see the example in the documentation.

madtoinou, Apr 24 '24 07:04