Change create_lagged_data and regressionmodel in order to have the names of the features by default in the sklearn model

Open dumjax opened this issue 2 years ago • 3 comments

dumjax avatar Oct 11 '22 12:10 dumjax

I had a look at how this could be implemented: the training_samples array in _fit_model() can be converted into a pd.DataFrame (or a similar data structure) with the values generated by create_lagged_component_names() as columns. A similar approach must be used in predict() to avoid the warnings sklearn raises when the features array passed to model.predict() does not have column names.

I don't find this approach elegant, and I'm not sure the benefits are worth the trouble. WDYT @dennisbader?

madtoinou avatar Nov 24 '23 12:11 madtoinou

If it requires converting to a pd.DataFrame, then I'd rather avoid it (because of the computational cost).

dennisbader avatar Dec 01 '23 13:12 dennisbader

I mean, xarray.DataArray could also be used, since it also allows for labeled dimensions without adding too much computational overhead?

Should I run a quick benchmark to assess the impact on the performance?
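Something along these lines could serve as a starting point. It is only a rough sketch that covers the pd.DataFrame option; the array shape, the column names and the LinearRegression model are arbitrary assumptions, and the numbers will depend on the machine and on the model used.

```python
import timeit

import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression

# Arbitrary problem size, chosen only for illustration.
rng = np.random.default_rng(0)
X = rng.normal(size=(10_000, 20))
y = rng.normal(size=10_000)
columns = [f"lag_{i}" for i in range(X.shape[1])]  # hypothetical names


def fit_array():
    # Baseline: fit directly on the NumPy array.
    LinearRegression().fit(X, y)


def fit_frame():
    # Candidate: wrap the array in a DataFrame before fitting,
    # so the cost of the conversion is included in the timing.
    LinearRegression().fit(pd.DataFrame(X, columns=columns), y)


n_runs = 5
t_array = min(timeit.repeat(fit_array, number=1, repeat=n_runs))
t_frame = min(timeit.repeat(fit_frame, number=1, repeat=n_runs))
print(f"ndarray:   {t_array:.4f} s")
print(f"DataFrame: {t_frame:.4f} s")
```

Taking the minimum over several runs reduces the influence of noise from other processes; the same pattern could be repeated for predict().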

madtoinou avatar Dec 01 '23 13:12 madtoinou