esrnn_torch

Input for OWA Calculation - AttributeError

Open leschaf opened this issue 3 years ago • 2 comments

Hi,

I was searching for OWA implementations so I can use the measure in one of my projects. I'm starting with the calculation for a single time series. The ESRNN lib provides this function:

final_owa, final_mase, final_smape = evaluate_prediction_owa(y_hat_df, y_train_df, X_test_df, y_test_df, naive2_seasonality=1)
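For reference, OWA as defined for the M4 competition is the average of the model's sMAPE and MASE, each divided by the same metric for the Naive2 benchmark. The following is a minimal single-series sketch of that definition, written independently of the ESRNN code; the function names and the toy numbers are illustrative assumptions, not the library's API.

```python
import numpy as np

def smape(y, y_hat):
    """Symmetric MAPE in percent (M4 definition)."""
    y, y_hat = np.asarray(y, dtype=float), np.asarray(y_hat, dtype=float)
    return 200.0 * np.mean(np.abs(y - y_hat) / (np.abs(y) + np.abs(y_hat)))

def mase(y, y_hat, y_insample, seasonality=1):
    """Forecast MAE scaled by the in-sample MAE of the seasonal naive forecast."""
    y, y_hat = np.asarray(y, dtype=float), np.asarray(y_hat, dtype=float)
    y_insample = np.asarray(y_insample, dtype=float)  # chronological order
    scale = np.mean(np.abs(y_insample[seasonality:] - y_insample[:-seasonality]))
    return np.mean(np.abs(y - y_hat)) / scale

def owa(y, y_hat, y_naive2, y_insample, seasonality=1):
    """Average of sMAPE and MASE, each relative to the Naive2 benchmark."""
    rel_smape = smape(y, y_hat) / smape(y, y_naive2)
    rel_mase = (mase(y, y_hat, y_insample, seasonality)
                / mase(y, y_naive2, y_insample, seasonality))
    return 0.5 * (rel_smape + rel_mase)

# Toy data (hypothetical values, for illustration only).
y_insample = [1.0, 2.0, 3.0, 4.0]   # history
y = [5.0, 6.0]                      # true future values
y_naive2 = [4.0, 4.0]               # benchmark forecast
y_hat = [4.5, 5.5]                  # model forecast

print(owa(y, y_hat, y_naive2, y_insample))
```

By construction, scoring the Naive2 forecast against itself gives an OWA of 1.0, so values below 1.0 mean the model beats the benchmark.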

When I look at the code (https://github.com/kdgutier/esrnn_torch/blob/master/ESRNN/utils_evaluation.py#L370-L400), X_test_df is not used at all in the function - is that correct?

Also, I'm not sure about the input format for using the function.

Here is my input for y_train_df, which contains the historical target values:

unique_id ds y
00010_000030_BS824_2018-01-01 2017-12-01 1497400.0
00010_000030_BS824_2018-01-01 2017-11-01 1707420.0
00010_000030_BS824_2018-01-01 2017-10-01 1989485.0
00010_000030_BS824_2018-01-01 2017-09-01 1697800.0
00010_000030_BS824_2018-01-01 2017-08-01 1574400.0
00010_000030_BS824_2018-01-01 2017-07-01 1260556.0
00010_000030_BS824_2018-01-01 2017-06-01 1319198.0
00010_000030_BS824_2018-01-01 2017-05-01 1592793.0
00010_000030_BS824_2018-01-01 2017-04-01 1575775.0
00010_000030_BS824_2018-01-01 2017-03-01 1808200.0
00010_000030_BS824_2018-01-01 2017-02-01 1365519.0
00010_000030_BS824_2018-01-01 2017-01-01 1904000.0
00010_000030_BS824_2018-01-01 2016-12-01 1713520.0
00010_000030_BS824_2018-01-01 2016-11-01 1908281.0
00010_000030_BS824_2018-01-01 2016-10-01 1737900.0
00010_000030_BS824_2018-01-01 2016-09-01 2005440.0
00010_000030_BS824_2018-01-01 2016-08-01 1683500.0
00010_000030_BS824_2018-01-01 2016-07-01 1179682.0
00010_000030_BS824_2018-01-01 2016-06-01 1834500.0
00010_000030_BS824_2018-01-01 2016-05-01 1949500.0
00010_000030_BS824_2018-01-01 2016-04-01 1811450.0
00010_000030_BS824_2018-01-01 2016-03-01 2001200.0
00010_000030_BS824_2018-01-01 2016-02-01 1273837.0

Here is the y_hat_df, which contains my model predictions for the future values:

unique_id ds y_hat
00010_000030_BS824_2018-01-01 2018-01-01 1403634.0
00010_000030_BS824_2018-01-01 2018-02-01 1543464.0
00010_000030_BS824_2018-01-01 2018-03-01 1751357.0
00010_000030_BS824_2018-01-01 2018-04-01 1874214.0
00010_000030_BS824_2018-01-01 2018-05-01 1810092.0
00010_000030_BS824_2018-01-01 2018-06-01 1811571.0
00010_000030_BS824_2018-01-01 2018-07-01 1828860.0
00010_000030_BS824_2018-01-01 2018-08-01 1708163.0
00010_000030_BS824_2018-01-01 2018-09-01 1672521.0
00010_000030_BS824_2018-01-01 2018-10-01 1809456.0
00010_000030_BS824_2018-01-01 2018-11-01 1870753.0
00010_000030_BS824_2018-01-01 2018-12-01 1596886.0
00010_000030_BS824_2018-01-01 2019-01-01 1253630.0
00010_000030_BS824_2018-01-01 2019-02-01 1618861.0
00010_000030_BS824_2018-01-01 2019-03-01 1466855.0
00010_000030_BS824_2018-01-01 2019-04-01 1677125.0
00010_000030_BS824_2018-01-01 2019-05-01 1887335.0
00010_000030_BS824_2018-01-01 2019-06-01 1576052.0

And finally, here is my y_test_df, which contains the true future values with the same dates as in y_hat_df:

unique_id ds y
00010_000030_BS824_2018-01-01 2018-01-01 2237400.0
00010_000030_BS824_2018-01-01 2018-02-01 1967330.0
00010_000030_BS824_2018-01-01 2018-03-01 1886660.0
00010_000030_BS824_2018-01-01 2018-04-01 1818600.0
00010_000030_BS824_2018-01-01 2018-05-01 2060476.0
00010_000030_BS824_2018-01-01 2018-06-01 1928000.0
00010_000030_BS824_2018-01-01 2018-07-01 1506416.0
00010_000030_BS824_2018-01-01 2018-08-01 1705200.0
00010_000030_BS824_2018-01-01 2018-09-01 1602600.0
00010_000030_BS824_2018-01-01 2018-10-01 2002980.0
00010_000030_BS824_2018-01-01 2018-11-01 1829730.0
00010_000030_BS824_2018-01-01 2018-12-01 1385800.0
00010_000030_BS824_2018-01-01 2019-01-01 1923362.0
00010_000030_BS824_2018-01-01 2019-02-01 1849415.0
00010_000030_BS824_2018-01-01 2019-03-01 1921600.0
00010_000030_BS824_2018-01-01 2019-04-01 2143900.0
00010_000030_BS824_2018-01-01 2019-05-01 2014900.0
00010_000030_BS824_2018-01-01 2019-06-01 1832100.0

Calling evaluate_prediction_owa fails with the error below. It is raised on this line: y_hat_id = y_hat_panel[top_row:bottom_row].y_hat.to_numpy(). Any idea why that happens? What am I missing?

AttributeError                            Traceback (most recent call last)
~/projects/semco/semicon-forecast/src/a4_benchmark.py in <module>
----> 1 evaluate_prediction_owa(y_hat_df, y_train_df,
      2                         None, y_test_df,
      3                         naive2_seasonality=12)
      4

~/miniconda3/envs/semicon/lib/python3.8/site-packages/ESRNN/utils_evaluation.py in evaluate_prediction_owa(y_hat_df, y_train_df, X_test_df, y_test_df, naive2_seasonality)
    390     y_insample = y_train_df.filter(['unique_id', 'ds', 'y'])
    391
--> 392     model_owa, model_mase, model_smape = owa(y_panel, y_hat_panel,
    393                                              y_naive2_panel, y_insample,
    394                                              seasonality=naive2_seasonality)

~/miniconda3/envs/semicon/lib/python3.8/site-packages/ESRNN/utils_evaluation.py in owa(y_panel, y_hat_panel, y_naive2_panel, y_insample, seasonality)
    350     total_mase = evaluate_panel(y_panel, y_hat_panel, mase,
    351                                 y_insample, seasonality)
--> 352     total_mase_naive2 = evaluate_panel(y_panel, y_naive2_panel, mase,
    353                                        y_insample, seasonality)
    354     total_smape = evaluate_panel(y_panel, y_hat_panel, smape)

~/miniconda3/envs/semicon/lib/python3.8/site-packages/ESRNN/utils_evaluation.py in evaluate_panel(y_panel, y_hat_panel, metric, y_insample, seasonality)
    316         top_row = np.asscalar(y_hat_panel['unique_id'].searchsorted(u_id, 'left'))
    317         bottom_row = np.asscalar(y_hat_panel['unique_id'].searchsorted(u_id, 'right'))
--> 318         y_hat_id = y_hat_panel[top_row:bottom_row].y_hat.to_numpy()
    319         assert len(y_id)==len(y_hat_id)
    320

~/miniconda3/envs/semicon/lib/python3.8/site-packages/pandas/core/generic.py in __getattr__(self, name)
   5463             if self._info_axis._can_hold_identifiers_and_holds_name(name):
   5464                 return self[name]
-> 5465             return object.__getattribute__(self, name)
   5466
   5467     def __setattr__(self, name: str, value) -> None:

AttributeError: 'DataFrame' object has no attribute 'y_hat'
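The traceback shows the failure inside the second evaluate_panel call, the one that receives y_naive2_panel, so the frame missing a y_hat column is the Naive2 panel rather than y_hat_df. One plausible mechanism, sketched below on toy data: pandas DataFrame.filter silently drops requested labels that do not exist. The traceback confirms that evaluate_prediction_owa uses filter for y_insample; that it builds the Naive2 panel the same way and then renames the column to y_hat is an assumption here, not something confirmed from the library source.

```python
import pandas as pd

# Toy test panel WITHOUT a 'y_hat_naive2' column, like the y_test_df above.
y_test_df = pd.DataFrame({
    "unique_id": ["A", "A"],
    "ds": pd.to_datetime(["2018-01-01", "2018-02-01"]),
    "y": [2237400.0, 1967330.0],
})

# filter(items=...) keeps only labels that exist; missing ones vanish silently.
y_naive2_panel = y_test_df.filter(items=["unique_id", "ds", "y_hat_naive2"])

# Assumed step (hypothetical, not taken from the library source):
# rename the benchmark column to the generic 'y_hat' name.
y_naive2_panel = y_naive2_panel.rename(columns={"y_hat_naive2": "y_hat"})

print(list(y_naive2_panel.columns))  # ['unique_id', 'ds']

try:
    y_naive2_panel.y_hat
except AttributeError as err:
    message = str(err)
    print(message)  # 'DataFrame' object has no attribute 'y_hat'
```

Neither filter nor rename complains about the missing column, so the problem only surfaces later at the attribute access.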

leschaf, Feb 14 '22 16:02

Hi, the argument y_test_df is a pandas DataFrame panel with the columns unique_id, ds, y, and y_hat_naive2. So your y_test_df must include the Naive2 predictions in order to calculate the OWA.
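A minimal sketch of adding that column, using a few rows from the data posted above. It assumes a non-seasonal series (naive2_seasonality=1), where Naive2 reduces to repeating the last observed training value; with a seasonality of 12, a proper Naive2 forecast also needs the seasonality test and seasonal adjustment, which this sketch does not implement.

```python
import pandas as pd

uid = "00010_000030_BS824_2018-01-01"

# A few rows from the issue's data; note y_train_df is not in chronological order.
y_train_df = pd.DataFrame({
    "unique_id": [uid] * 3,
    "ds": pd.to_datetime(["2017-12-01", "2017-11-01", "2017-10-01"]),
    "y": [1497400.0, 1707420.0, 1989485.0],
})
y_test_df = pd.DataFrame({
    "unique_id": [uid] * 2,
    "ds": pd.to_datetime(["2018-01-01", "2018-02-01"]),
    "y": [2237400.0, 1967330.0],
})

# Last observed value per series, taken in chronological order.
last_obs = (
    y_train_df.sort_values(["unique_id", "ds"])
    .groupby("unique_id")["y"]
    .last()
)

# Non-seasonal stand-in for Naive2: repeat the last observation over the horizon.
y_test_df = y_test_df.copy()
y_test_df["y_hat_naive2"] = y_test_df["unique_id"].map(last_obs)

# Pre-flight check for the columns named in the reply above.
required = {"unique_id", "ds", "y", "y_hat_naive2"}
missing = required - set(y_test_df.columns)
if missing:
    raise ValueError(f"y_test_df is missing columns: {sorted(missing)}")

print(y_test_df)
```

Running the column check before calling evaluate_prediction_owa turns the late AttributeError into an explicit message about the input format.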

AzulGarza, Feb 14 '22 18:02

Thank you - that helped!

Any comment on the use of X_test_df?

leschaf, Feb 16 '22 07:02