[Core] Getting error when doing predict_insample

Open iamyihwa opened this issue 5 months ago • 13 comments

What happened + What you expected to happen

Hello, when I run an in-sample forecast with predict_insample, I get this error: Exception: `test_size - h` should be module `step_size`

Exception                                 Traceback (most recent call last)
File <command-32941315092027>, line 1
----> 1 Y_hat_insample = nf.predict_insample(step_size = horizon)

File /local_disk0/.ephemeral_nfs/envs/pythonEnv-f5388d12-693f-46a8-9203-d411f41d9a38/lib/python3.10/site-packages/neuralforecast/core.py:601, in NeuralForecast.predict_insample(self, step_size)
    597 # Generate dates
    598 len_series = np.diff(
    599     trimmed_dataset.indptr
    600 )  # Computes the length of each time series based on indptr
--> 601 fcsts_df = _insample_dates(
    602     uids=self.uids,
    603     last_dates=last_dates_train,
    604     freq=self.freq,
    605     h=self.h,
    606     len_series=len_series,
    607     step_size=step_size,
    608 )
    609 fcsts_df = fcsts_df.set_index("unique_id")
    611 col_idx = 0

File /local_disk0/.ephemeral_nfs/envs/pythonEnv-f5388d12-693f-46a8-9203-d411f41d9a38/lib/python3.10/site-packages/neuralforecast/core.py:87, in _insample_dates(uids, last_dates, freq, h, len_series, step_size)
     81 """
     82 Generate insample dates for `predict_insample` function. Uses `_cv_dates`
     83 method with separate sizes and last dates for each series.
     84 """
     85 if (len(np.unique(last_dates)) == 1) and (len(np.unique(len_series)) == 1):
     86     # Dates can be generated simulatenously if ld and ls are the same for all series
---> 87     dates = _cv_dates(last_dates, freq, h, len_series[0], step_size)
     88     dates["unique_id"] = np.repeat(uids, len(dates) // len(uids))
     89 else:

File /local_disk0/.ephemeral_nfs/envs/pythonEnv-f5388d12-693f-46a8-9203-d411f41d9a38/lib/python3.10/site-packages/neuralforecast/core.py:44, in _cv_dates(last_dates, freq, h, test_size, step_size)
     41 def _cv_dates(last_dates, freq, h, test_size, step_size=1):
     42     # assuming step_size = 1
     43     if (test_size - h) % step_size:
---> 44         raise Exception("`test_size - h` should be module `step_size`")
     45     n_windows = int((test_size - h) / step_size) + 1
     46     if len(np.unique(last_dates)) == 1:

Exception: `test_size - h` should be module `step_size`

Versions / Dependencies

1.6.4

Reproduction script

from neuralforecast.losses.pytorch import RMSE
from neuralforecast.losses.pytorch import  HuberMQLoss, DistributionLoss
from neuralforecast import NeuralForecast
from neuralforecast.auto import TimesNet, AutoNHITS, AutoLSTM, AutoRNN
from neuralforecast.models import Informer, Autoformer, FEDformer, PatchTST
from neuralforecast.models import NHITS, GRU
quantiles = [0.5]
horizon = 13
nf = NeuralForecast(
    models=[
        NHITS(
            h=horizon,
            input_size=2 * horizon,
            loss=HuberMQLoss(quantiles=quantiles),
            dropout_prob_theta=0.6,  # dropout to robustify vs outlier lag inputs
            # stat_exog_list=['airline1'],
            n_freq_downsample=[2, 1, 1],
            scaler_type='robust',
            # alias='NHITS',
            max_steps=200,
            # early_stop_patience_steps=2,
            inference_windows_batch_size=1,
            # val_check_steps=10,
            learning_rate=1e-3,
        ),
        GRU(
            h=horizon,
            input_size=-1,
            loss=RMSE(),
            scaler_type='robust',
            encoder_n_layers=2,
            encoder_hidden_size=128,
            context_size=10,
            decoder_hidden_size=128,
            decoder_layers=2,
            max_steps=200,
        ),
    ],
    freq='4W-SAT',
)

nf.fit(train_df)
preds_nf_df = nf.predict()

## Up to here works fine

## From here getting error
Y_hat_insample = nf.predict_insample(step_size=horizon)

Issue Severity

None

iamyihwa avatar Jan 18 '24 13:01 iamyihwa

Hi @iamyihwa! The predict_insample method internally uses the entire length of the series as the test_size. We currently have a limitation: test_size - h (that is, length - h) must be divisible by step_size, to avoid producing forecasts past the last date. You will need to trim the time series slightly to account for this.

We know that this might be confusing, and we are working on removing this limitation!
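
As a rough illustration of the trimming mentioned above, the sketch below computes how many observations would have to be dropped so that length - h becomes divisible by step_size. The helper names rows_to_trim and trim_oldest are hypothetical (they are not part of neuralforecast), and dropping the oldest observations rather than the newest ones is an assumption made here to keep the most recent data.

```python
def rows_to_trim(series_length: int, h: int, step_size: int) -> int:
    """Number of observations to drop so that (length - h) % step_size == 0."""
    if series_length < h:
        raise ValueError("series is shorter than the forecast horizon")
    return (series_length - h) % step_size


def trim_oldest(values: list, h: int, step_size: int) -> list:
    """Drop the oldest observations and keep the most recent ones (an assumption)."""
    return values[rows_to_trim(len(values), h, step_size):]


series = list(range(51))  # stand-in for one series with 51 training time steps
trimmed = trim_oldest(series, h=13, step_size=13)

print(rows_to_trim(51, 13, 13))  # 12
print(len(trimmed))              # 39
```

With a long-format data frame the same count would apply per series, and the trimming would presumably need to happen before fit, since predict_insample forecasts the data the models were fitted on.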

cchallu avatar Jan 19 '24 16:01 cchallu

Thanks @cchallu and team for the great work! With regards to this,
Just to understand a bit better: I currently have test_size and h set to the same size, so test_size - h = 0, and 0 divided by any step_size leaves no remainder. Is this wrong?

iamyihwa avatar Jan 19 '24 17:01 iamyihwa

The issue is that predict_insample internally sets test_size = series_length - true_test_size (where true_test_size is the one you defined), because it is forecasting the training data. This internal test_size is the one that must satisfy the condition. Is that clear? You need to trim the df dataset. We will fix this soon because it is confusing.

cchallu avatar Jan 21 '24 16:01 cchallu

Thanks @cchallu for the explanation. I was not sure I had understood correctly, so I changed step_size to a different value (2), and it now produces the fitted forecasts. In this case the train set has 51 time steps and the test set has 13, so 51 - 13 = 38, and 38 is divisible by 2.
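
The arithmetic above can be checked with a small stand-alone sketch that mirrors the condition shown in the `_cv_dates` traceback. The function name n_insample_windows is hypothetical, and a ValueError is raised here instead of the plain Exception used by the library.

```python
def n_insample_windows(test_size: int, h: int, step_size: int = 1) -> int:
    """Mirror of the divisibility check from the traceback, for illustration only."""
    if (test_size - h) % step_size:
        raise ValueError("`test_size - h` should be a multiple of `step_size`")
    return (test_size - h) // step_size + 1


# 51 time steps with h = 13: 51 - 13 = 38
print(n_insample_windows(51, 13, step_size=2))  # 38 % 2 == 0, so 20 windows

try:
    n_insample_windows(51, 13, step_size=13)    # 38 % 13 == 12, so this raises
except ValueError as err:
    print(err)
```

This matches the behavior reported in the thread: step_size=13 fails because 38 leaves a remainder of 12, while step_size=2 passes.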

iamyihwa avatar Jan 22 '24 16:01 iamyihwa

@quest-bot stash 100

AzulGarza avatar Jan 25 '24 19:01 AzulGarza

New Quest!

A new Quest has been launched in @Nixtla’s repo. Merge a PR that solves this issue to loot the Quest and earn your reward.


Loot of 100 USD has been stashed in this issue to reward the solver!

🗡 Comment @quest-bot embark to check-in for this Quest and start solving the issue. Other solvers will be notified!

⚔️ When you submit a PR, comment @quest-bot loot #866 to link your PR to this Quest.

Questions? Check out the docs.

quest-bot[bot] avatar Jan 25 '24 19:01 quest-bot[bot]

Potential solution:

  • predict_insample always sets step_size=1 at the beginning of the function, but stores the value requested by the user.
  • With step_size=1, the troublesome condition is always satisfied.
  • At the end of the function, filter out the extra cutoff dates from the fcsts_df dataframe according to the step_size the user requested. For example, if step_size=3, keep dates 1, 4, 7, etc.
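
The idea in the list above can be sketched on plain window positions instead of the real fcsts_df dataframe. This is only an illustration under that simplification: insample_window_starts is a hypothetical helper, and window start positions stand in for cutoff dates.

```python
def insample_window_starts(series_length: int, h: int, step_size: int) -> list:
    """Generate windows with step 1, then keep every step_size-th one."""
    # With a step of 1, (series_length - h) % 1 == 0 always holds.
    all_starts = list(range(series_length - h + 1))
    # Keep positions 0, step_size, 2 * step_size, ... ("dates 1, 4, 7" for step_size=3).
    return all_starts[::step_size]


print(insample_window_starts(51, 13, step_size=13))  # [0, 13, 26]
print(insample_window_starts(10, 3, step_size=3))    # [0, 3, 6]
```

Every kept window still ends inside the series, so no forecast falls past the last date, and no trimming of the input is needed.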

cchallu avatar Jan 26 '24 22:01 cchallu

@quest-bot embark

isaac-chung avatar Jan 27 '24 09:01 isaac-chung

@isaac-chung has embarked on their Quest 🗡

  • @isaac-chung has been on GitHub since 2019.
  • They have merged 93 public PRs in that time.
  • Their swords are blessed with Python and Shell magic ✨
  • They haven't contributed to this repo before.

quest-bot[bot] avatar Jan 27 '24 09:01 quest-bot[bot]

🧚 @isaac-chung has submitted PR https://github.com/Nixtla/neuralforecast/issues/881 and is claiming the loot.

Keep up the pace, or you'll be left in the shadows.

quest-bot[bot] avatar Jan 27 '24 15:01 quest-bot[bot]

@quest-bot embark

JQGoh avatar Mar 14 '24 00:03 JQGoh

@JQGoh has embarked on their Quest. 🗡

  • @JQGoh has been on GitHub since 2014.
  • They have merged 0 public PRs in that time.
  • Their swords are blessed with Python and Shell magic ✨
  • They haven't contributed to this repo before.

quest-bot[bot] avatar Mar 14 '24 00:03 quest-bot[bot]

🧚 @JQGoh has submitted PR https://github.com/Nixtla/neuralforecast/issues/933 and is claiming the loot.

Keep up the pace, or you'll be left in the shadows.

cc @isaac-chung

quest-bot[bot] avatar Mar 14 '24 19:03 quest-bot[bot]

Hello. When I import this in my code:

from neuralforecast.models.mqnhits.mqnhits import MQNHITS (pre-trained N-Hits model)

I get this error:

ModuleNotFoundError                       Traceback (most recent call last)
in <cell line: 1>()
----> 1 from neuralforecast.models.mqnhits.mqnhits import MQNHITS

ModuleNotFoundError: No module named 'neuralforecast.models.mqnhits'

---------------------------------------------------------------------------
NOTE: If your import is failing due to a missing package, you can
manually install dependencies using either !pip or !apt.

To view examples of installing some common dependencies, click the
"Open Examples" button below.
---------------------------------------------------------------------------

Can you help me to fix this error?

ladan-gh avatar Mar 22 '24 09:03 ladan-gh

@ladan-gh You might find more relevant advice and help in the Slack channel, since this issue is used to track the originally reported problem. Others could advise you better on this in Slack.

JQGoh avatar Mar 22 '24 23:03 JQGoh

Thanks, can you send me a link to the Slack channel? I don't have it.

ladan-gh avatar Mar 24 '24 11:03 ladan-gh

@ladan-gh It is mentioned on the main page: https://github.com/Nixtla/neuralforecast Look for the Slack icon. Alternatively, this link should work: https://join.slack.com/t/nixtlacommunity/shared_invite/zt-2fft79p5v-z3BXNHuF7TMD3YNlT2Uu_A

JQGoh avatar Mar 24 '24 12:03 JQGoh

Thanks a lot🙏🏻.

ladan-gh avatar Mar 24 '24 15:03 ladan-gh