darts icon indicating copy to clipboard operation
darts copied to clipboard

add_datetime_attribute changed behavior in 0.28

Open connesy opened this issue 11 months ago • 4 comments

Describe the bug The values added to the timeseries in add_datetime_attribute changed from 0.27.2 to 0.28.0.

To Reproduce

import numpy as np
import pandas as pd
from darts import TimeSeries

raw = np.array([1, 2, 3, 4, 5])
data = pd.DataFrame(
    {"comp1": raw, "comp2": raw * 10},
    index=pd.date_range(start="2022-01-01", periods=len(raw), freq="D"),
)
timeseries = TimeSeries.from_dataframe(data)

timeseries = timeseries.add_datetime_attribute("day")
print(new_timeseries["day"].values().flatten())

# With u8darts==0.27.2
[1. 2. 3. 4. 5.]

# With u8darts==0.28.0
[0. 1. 2. 3. 4.]

timeseries = timeseries.add_datetime_attribute("month")
print(new_timeseries["month"].values().flatten())

# With u8darts==0.27.2
[1. 1. 1. 1. 1.]

# With u8darts==0.28.0
[0. 0. 0. 0. 0.]

Expected behavior I would expect the behavior from 0.27.2, since that is how the datetime attributes are specified in the Pandas DatetimeIndex documentation: https://pandas.pydata.org/pandas-docs/version/1.5/reference/api/pandas.DatetimeIndex.day.html#pandas.DatetimeIndex.day https://pandas.pydata.org/pandas-docs/version/1.5/reference/api/pandas.DatetimeIndex.month.html#pandas.DatetimeIndex.month

System (please complete the following information):

  • Python version: 3.10
  • darts version 0.28.0
  • pandas version: 1.5.3

connesy avatar Mar 11 '24 09:03 connesy

Hi @connesy, and thanks for this issue. It was actually an intended change to enforce 0-indexing now for all datetime attributes. We added it as a breaking change in fixed section of the CHANGELOG.

With this we wanted to overcome pandas "ambiguity" between 0- and 1-indexed attributes (e.g. weekday goes from 0-6, month from 1-12, ...).

This allows us to generate cyclic encodings that all start at index 0 and also to handle one hot encoding better.

Unless there is a strong reason to go back to the old behavior, I think we'll stick with these changes.

dennisbader avatar Mar 11 '24 11:03 dennisbader

Hi @dennisbader, thanks for the explanation. I can see how it makes sense to enforce the 0-indexing for encodings.

In that case, I would like to add a note to https://unit8co.github.io/darts/generated_api/darts.timeseries.html#darts.timeseries.TimeSeries.add_datetime_attribute that references the documentation for timeseries_generation.datetime_attribute_timeseries. Would that be okay?

connesy avatar Mar 11 '24 12:03 connesy

Hi @connesy, yes, that's a great idea. Do you want to raise a PR for that?

dennisbader avatar Mar 11 '24 12:03 dennisbader

Sure :+1:

connesy avatar Mar 11 '24 13:03 connesy