[DRAFT] Make base class of Transforms work with TSDataset

Open alex-hse-repository opened this issue 2 years ago • 1 comments

🚀 Feature Request

We need to teach our transforms to work with a dataset (TSDataset) instead of a dataframe

Proposal

  1. In Transform:
  • Add method get_regressors_info() -> List[str]

    • Return the list with added regressors
    • In default implementation returns []
  • Create method update_dataset(ts: TSDataset, df: pd.DataFrame, df_transformed: pd.DataFrame)

    • Add/update columns in ts using update_columns_from_pandas, if len(df.columns) <= len(df_transformed.columns)
    • Remove columns from ts using remove_columns, if len(df.columns) > len(df_transformed.columns)
    • Use the method get_regressors_info to get the information about the regressors if necessary
  • Create method fit(self, ts: TSDataset):

    • Make the current method private _fit(df: pd.DataFrame)
    • Get the dataframe with necessary columns from ts using method to_pandas
    • Use method _fit to fit the transform with this dataframe
  • Create method transform(self, ts: TSDataset):

    • Make the current method private _transform(df: pd.DataFrame)
    • Get the dataframe with necessary columns from ts using method to_pandas
    • Use method _transform to transform this dataframe
    • Use method update_dataset to update the columns in the dataset
  • Create method inverse_transform(self, ts: TSDataset):

    • Make the current method private _inverse_transform(df: pd.DataFrame)
    • Get the dataframe with necessary columns from ts using method to_pandas
    • Use method _inverse_transform to transform this dataframe
    • Use method update_dataset to update the columns in the dataset
  2. In TSDataset:
  • Fix all the places where the transforms are called: we now need to pass self instead of self.df
  • Transforms now perform the transformation in place

The necessary columns are defined using the in_column attribute, added in #810
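
The proposal above could be sketched roughly as follows. This is a minimal, dependency-free illustration rather than the actual etna implementation: a plain dict of columns stands in for pd.DataFrame, `DatasetStub` is a hypothetical stand-in for TSDataset that exposes only the methods named in the proposal (their signatures are assumptions), and `_required_frame` and `MeanShiftTransform` are made-up names used only for the example. Regressor bookkeeping via get_regressors_info is left out of update_dataset.

```python
from abc import ABC, abstractmethod
from typing import Dict, List

# Stand-in for pd.DataFrame (assumption): column name -> list of values.
Frame = Dict[str, List[float]]


class DatasetStub:
    """Hypothetical stand-in for TSDataset with only the methods the proposal names."""

    def __init__(self, columns: Frame) -> None:
        self._columns = {name: list(values) for name, values in columns.items()}

    def to_pandas(self) -> Frame:
        # Return a copy so callers cannot mutate the dataset by accident.
        return {name: list(values) for name, values in self._columns.items()}

    def update_columns_from_pandas(self, df_update: Frame) -> None:
        self._columns.update(df_update)

    def remove_columns(self, columns: List[str]) -> None:
        for name in columns:
            self._columns.pop(name, None)


class Transform(ABC):
    def __init__(self, in_column: str) -> None:
        self.in_column = in_column

    def get_regressors_info(self) -> List[str]:
        """Regressors added by the transform; none by default."""
        return []

    @abstractmethod
    def _fit(self, df: Frame) -> None: ...

    @abstractmethod
    def _transform(self, df: Frame) -> Frame: ...

    @abstractmethod
    def _inverse_transform(self, df: Frame) -> Frame: ...

    def _required_frame(self, ts: DatasetStub) -> Frame:
        # Hypothetical helper: keep only the column named by in_column.
        return {self.in_column: ts.to_pandas()[self.in_column]}

    def update_dataset(self, ts: DatasetStub, df: Frame, df_transformed: Frame) -> DatasetStub:
        # len() of a dict counts columns, mirroring len(df.columns) in pandas.
        if len(df) <= len(df_transformed):
            ts.update_columns_from_pandas(df_transformed)
        else:
            ts.remove_columns([name for name in df if name not in df_transformed])
        return ts

    def fit(self, ts: DatasetStub) -> "Transform":
        self._fit(self._required_frame(ts))
        return self

    def transform(self, ts: DatasetStub) -> DatasetStub:
        df = self._required_frame(ts)
        return self.update_dataset(ts, df, self._transform(df))

    def inverse_transform(self, ts: DatasetStub) -> DatasetStub:
        df = self._required_frame(ts)
        return self.update_dataset(ts, df, self._inverse_transform(df))


class MeanShiftTransform(Transform):
    """Hypothetical example: subtract the fitted mean of in_column."""

    def _fit(self, df: Frame) -> None:
        values = df[self.in_column]
        self._mean = sum(values) / len(values)

    def _transform(self, df: Frame) -> Frame:
        return {self.in_column: [v - self._mean for v in df[self.in_column]]}

    def _inverse_transform(self, df: Frame) -> Frame:
        return {self.in_column: [v + self._mean for v in df[self.in_column]]}


ts = DatasetStub({"target": [1.0, 2.0, 3.0], "exog": [10.0, 20.0, 30.0]})
shift = MeanShiftTransform(in_column="target")
shift.fit(ts).transform(ts)  # ts is modified in place; "exog" is left untouched
```

Keeping the dataframe logic in the private `_fit` / `_transform` / `_inverse_transform` methods means existing transforms would only need a rename, while the public methods handle column selection and writing results back to the dataset.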

Test cases

  1. Test that update_dataset works correctly in different cases (update_regressors + update columns)
  2. Test fit, transform, inverse_transform logic using mocks
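
The second test case might look something like the sketch below, which uses `unittest.mock` so that no real dataset or dataframe is needed. The `Transform` class here is a trimmed, hypothetical version of the proposed base class (fit and transform only), and the test function names are illustrative; the real tests would target the actual etna class.

```python
from unittest.mock import MagicMock


class Transform:
    """Trimmed, hypothetical version of the proposed base class."""

    def _fit(self, df):
        raise NotImplementedError

    def _transform(self, df):
        raise NotImplementedError

    def update_dataset(self, ts, df, df_transformed):
        ts.update_columns_from_pandas(df_transformed)
        return ts

    def fit(self, ts):
        self._fit(ts.to_pandas())
        return self

    def transform(self, ts):
        df = ts.to_pandas()
        df_transformed = self._transform(df)
        return self.update_dataset(ts, df, df_transformed)


def test_fit_delegates_to_private_fit():
    ts = MagicMock()
    transform = Transform()
    transform._fit = MagicMock()

    transform.fit(ts)

    # fit should read the dataframe from ts and hand it to _fit.
    ts.to_pandas.assert_called_once_with()
    transform._fit.assert_called_once_with(ts.to_pandas.return_value)


def test_transform_updates_dataset():
    ts = MagicMock()
    transform = Transform()
    transform._transform = MagicMock()
    transform.update_dataset = MagicMock()

    transform.transform(ts)

    # transform should pass the dataframe to _transform, then write back via update_dataset.
    transform._transform.assert_called_once_with(ts.to_pandas.return_value)
    transform.update_dataset.assert_called_once_with(
        ts, ts.to_pandas.return_value, transform._transform.return_value
    )
```

An analogous test for inverse_transform would mock `_inverse_transform` in the same way.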

Additional context

Is blocked by #810

alex-hse-repository avatar Jul 14 '22 05:07 alex-hse-repository

Special in_column handling in ResamplingTransform

alex-hse-repository avatar Jul 27 '22 12:07 alex-hse-repository