ValueError: xreg is rank deficient

Open obiii opened this issue 10 months ago • 3 comments

What happened + What you expected to happen

Hi,

I am trying to pass exogenous features to the StatsForecast.fit method, but it fails with:

ValueError: xreg is rank deficient

I am using one-hot encoding for the month, so some columns are all zeros in the example data below. With the full data, however, there are no all-zero columns and no constant columns. Also, no pair of features has an absolute correlation above 0.7.
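The checks described above (no constant columns, low pairwise correlation) do not rule out a linear dependence that involves several columns at once. A minimal sketch of how to test for that directly, assuming a made-up one-year daily calendar rather than the real data; the names `dates`, `X`, `rank`, and `max_abs_corr` are illustrative only, and this is not necessarily the check that statsforecast runs internally:

```python
import numpy as np
import pandas as pd

# Hypothetical calendar covering one full year (illustration only, not the real data).
dates = pd.date_range("2023-01-01", "2023-12-31", freq="D")

# Full one-hot encodings: 7 day-of-week columns and 12 month columns.
dow = pd.get_dummies(pd.Series(dates.dayofweek), prefix="day_of_week", dtype=float)
month = pd.get_dummies(pd.Series(dates.month), prefix="month_indicator", dtype=float)
X = pd.concat([dow, month], axis=1)

n_cols = X.shape[1]

# Check 1: no column is constant.
has_constant = bool((X.nunique() == 1).any())

# Check 2: largest absolute pairwise correlation (diagonal zeroed out).
off_diag = np.abs(X.corr().to_numpy()) - np.eye(n_cols)
max_abs_corr = float(off_diag.max())

# Check 3: the rank is still lower than the number of columns, because the
# day-of-week columns and the month columns each sum to a column of ones.
rank = int(np.linalg.matrix_rank(X.to_numpy()))

print(f"constant columns: {has_constant}")
print(f"max |corr|: {max_abs_corr:.3f}")
print(f"rank {rank} of {n_cols} columns")
```

Comparing `rank` with `n_cols` on the full feature matrix shows whether any column is a linear combination of the others, which pairwise correlations cannot reveal.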

Versions / Dependencies

Python: 3.11, statsforecast: 1.7.4

Reproduction script

import pandas as pd
from statsforecast import StatsForecast
from statsforecast.models import (
    MSTL,
    AutoARIMA,
    AutoETS,
    HoltWinters,
    RandomWalkWithDrift,
    SeasonalNaive,
)

# Candidate models; the commented-out entries are currently disabled.
models = [
    AutoARIMA(season_length=31, nmodels=94, allowdrift=True),
    # AutoCES(season_length=30),
    AutoETS(season_length=31),
    HoltWinters(season_length=31),
    MSTL(season_length=31, trend_forecaster=AutoARIMA(), alias="MSTL-ARIMA"),
    MSTL(season_length=31),
    # AutoTheta(season_length=31),
    # DOT(season_length=31),
    # SeasonalWindowAverage(
    #     window_size=60, season_length=30
    # ),
    # SeasonalWindowAverage(
    #     window_size=90, season_length=30, alias="SeasWA30-93"
    # ),
    # SeasonalWindowAverage(
    #     window_size=120, season_length=30, alias="SeasWA30-120"
    # ),
    RandomWalkWithDrift(),
    SeasonalNaive(season_length=31),
]

dc_models = StatsForecast(
    models=models,
    freq="D",
    n_jobs=-1,
    verbose=True,
)

data = {
    'ds': ['2024-04-01', '2024-04-02'],
    'unique_id': [1, 2],
    'y': [100, 200],
    'holiday': [0, 1],
    'daysuntilendmonth': [10, 9],
    'tax_return': [1, 0],
    'bailiff_finland': [0, 1],
    'salary': [5000, 6000],
    'day_of_week_0': [0, 0],
    'day_of_week_1': [0, 1],
    'day_of_week_2': [1, 0],
    'day_of_week_3': [0, 0],
    'day_of_week_4': [0, 0],
    'day_of_week_5': [0, 0],
    'day_of_week_6': [0, 0],
    'month_indicator_1': [0, 0],
    'month_indicator_2': [0, 0],
    'month_indicator_3': [0, 0],
    'month_indicator_4': [1, 1],
    'month_indicator_5': [0, 0],
    'month_indicator_6': [0, 0],
    'month_indicator_7': [0, 0],
    'month_indicator_8': [0, 0],
    'month_indicator_9': [0, 0],
    'month_indicator_10': [0, 0],
    'month_indicator_11': [0, 0],
    'month_indicator_12': [0, 0],
    'quarter_1': [0, 0],
    'quarter_2': [1, 1],
    'quarter_3': [0, 0],
    'quarter_4': [0, 0]
}
data = pd.DataFrame(data)
data['ds'] = pd.to_datetime(data['ds'])
exog = True
dc_models.fit(df = data if exog else data[['ds', 'unique_id', 'y']], prediction_intervals=None)

Update: If I remove both the day_of_week and month_indicator one-hot encodings, it works, but I am not sure why. Also, is there another way to include the month? It is an important feature.

Issue Severity

High: It blocks me from completing my task.

obiii avatar Apr 19 '24 07:04 obiii

Hey @obiii, thanks for using statsforecast. Can you please provide a minimal reproducible example? You can follow the tips here.

jmoralez avatar Apr 19 '24 17:04 jmoralez

Hi @jmoralez I have updated the question now.

obiii avatar Apr 24 '24 13:04 obiii

Thanks! I believe this is due to the collinearity that the dummies introduce. Can you try dropping one of the levels, i.e. use 6 dummies for day of week, 11 for month and 3 for quarter? You can read more about the problem here.
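A minimal sketch of the suggestion, assuming pandas `get_dummies` with `drop_first=True` is used to build the encodings. The calendar below is made up for illustration, and `full_rank` is a hypothetical helper, not part of statsforecast. One extra point the sketch exposes: each quarter indicator equals the sum of its three month indicators, so in this example the quarter columns stay redundant even after a level is dropped, and full rank is only reached once they are left out. Whether this resolves the error in the original report is not confirmed in this thread.

```python
import numpy as np
import pandas as pd

# Hypothetical calendar covering one full year (illustration only).
dates = pd.date_range("2023-01-01", "2023-12-31", freq="D")
calendar = pd.DataFrame(
    {
        "day_of_week": dates.dayofweek,
        "month_indicator": dates.month,
        "quarter": dates.quarter,
    }
)
cat_cols = ["day_of_week", "month_indicator", "quarter"]


def full_rank(frame: pd.DataFrame) -> bool:
    """Return True when no column is a linear combination of the others."""
    matrix = frame.to_numpy(dtype=float)
    return bool(np.linalg.matrix_rank(matrix) == matrix.shape[1])


# All levels kept: 7 + 12 + 4 = 23 columns.
all_levels = pd.get_dummies(calendar, columns=cat_cols, dtype=float)

# One level dropped per feature: 6 + 11 + 3 = 20 columns.
reduced = pd.get_dummies(calendar, columns=cat_cols, drop_first=True, dtype=float)

# Quarter columns removed as well: 6 + 11 = 17 columns.
no_quarter = reduced.loc[:, ~reduced.columns.str.startswith("quarter_")]

for name, frame in [
    ("all levels", all_levels),
    ("drop_first", reduced),
    ("drop_first, no quarter", no_quarter),
]:
    print(f"{name}: {frame.shape[1]} columns, full rank = {full_rank(frame)}")
```

The resulting columns can then be joined back onto the `ds`, `unique_id`, and `y` columns before calling `fit`.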

jmoralez avatar Apr 24 '24 16:04 jmoralez

This issue has been automatically closed because it has been awaiting a response for too long. When you have time to work with the maintainers to resolve this issue, please post a new comment and it will be re-opened. If the issue has been locked for editing by the time you return to it, please open a new issue and reference this one.

github-actions[bot] avatar May 25 '24 04:05 github-actions[bot]