timeseries-forecast
d parameter does not work well
For this dataset: 2674.8060304978917, 3371.1788109723193, 2657.161969121835, 2814.5583226655367, 3290.855749923403, 3103.622791045206, 3403.2011487950185, 2841.438925235243, 2995.312700153925, 3256.4042898633224, 2609.8702933486843, 3214.6409110870877, 2952.1736018157644, 3468.7045537306344, 3260.9227206904898, 2645.5024256492215, 3137.857549381811, 3311.3526531674556, 2929.7762119375716, 2846.05991810631, 2606.47822546165, 3174.9770937667918, 3140.910443979614, 2590.6601484185085, 3123.4299821259915, 2714.4060964141136, 3133.9561758319487, 2951.3288157912752, 2860.3114228342765, 2757.4279640677833
the next 6 data points in real life are: 3147.816496825682, 3418.2300802476093, 2856.905414401418, 3419.0312162705545, 3307.9803365878442, 3527.68377555284
Note that there is a slight upward trend in the dataset. I would expect a slightly better result with d = 1.
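Before comparing d settings, it may help to quantify the "slight trend up" claim. A minimal sketch, assuming an ordinary least-squares slope of the series against its index (`ols_slope` is a name I made up, not part of timeseries-forecast):

```python
# Hypothetical helper (not from the library): least-squares slope of a
# series against its index, as one rough measure of drift per time step.
def ols_slope(series):
    n = len(series)
    mean_x = (n - 1) / 2
    mean_y = sum(series) / n
    num = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(series))
    den = sum((x - mean_x) ** 2 for x in range(n))
    return num / den

data = [
    2674.8060304978917, 3371.1788109723193, 2657.161969121835,
    2814.5583226655367, 3290.855749923403, 3103.622791045206,
    3403.2011487950185, 2841.438925235243, 2995.312700153925,
    3256.4042898633224, 2609.8702933486843, 3214.6409110870877,
    2952.1736018157644, 3468.7045537306344, 3260.9227206904898,
    2645.5024256492215, 3137.857549381811, 3311.3526531674556,
    2929.7762119375716, 2846.05991810631, 2606.47822546165,
    3174.9770937667918, 3140.910443979614, 2590.6601484185085,
    3123.4299821259915, 2714.4060964141136, 3133.9561758319487,
    2951.3288157912752, 2860.3114228342765, 2757.4279640677833,
]
print(ols_slope(data))  # sign and magnitude indicate drift per time step
```

The sign and size of this slope, relative to the noise in the series, is one quick way to judge whether first differencing (d = 1) is warranted.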
predict the next 6 points:
- use p = 2, d = 0, all others = 0, confidence = 0.8:
RMSE: 199.8163163213122. Forecast: 3079.5652126415816, 3018.357601612911, 2972.5923804575086, 2995.670261137454, 2998.568039880799, 2993.4644978016477
if I use p = 2, d = 1, all others = 0, confidence = 0.8:
RMSE: 253.6211530852703. Forecast: 2886.970570844559, 2835.4065710542895, 2825.3444795390224, 2862.835372496484, 2843.664918178257, 2848.710910931624
So, with d = 1, it is about 27% worse measured by RMSE.
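For reference, the out-of-sample error against the six known actuals can be computed directly. A minimal sketch (`rmse` is my helper, not a library function); note the RMSE figures quoted above do not necessarily use these six actuals (they may be in-sample fit errors), so the number printed here need not match them:

```python
import math

# Hypothetical helper (not from timeseries-forecast): root-mean-square
# error between a forecast and the observed values.
def rmse(forecast, actual):
    if len(forecast) != len(actual):
        raise ValueError("series must be the same length")
    return math.sqrt(
        sum((f - a) ** 2 for f, a in zip(forecast, actual)) / len(forecast)
    )

actual = [3147.816496825682, 3418.2300802476093, 2856.905414401418,
          3419.0312162705545, 3307.9803365878442, 3527.68377555284]
forecast_d0 = [3079.5652126415816, 3018.357601612911, 2972.5923804575086,
               2995.670261137454, 2998.568039880799, 2993.4644978016477]
print(rmse(forecast_d0, actual))
```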
- p = 3, d = 0, confidence = 0.8:
RMSE: 215.41839087831758. Forecast: 3084.2187112556344, 3008.4402233252013, 2979.552888778218, 2995.6062055652237, 2996.5840132883823, 2994.1306487277516
p = 3, d = 1, confidence = 0.8:
RMSE: 290.36910432429397. Forecast: 2954.849845119037, 2867.9992817121565, 2862.9013638112624, 2878.2804776911416, 2896.2693454378027, 2880.143205617359
The result is about 35% worse with d = 1 measured by RMSE.
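For context on what d = 1 actually does: the model is fit on first differences of the series, and the differenced forecasts are integrated back to the original level. A minimal sketch of those two transforms, assuming standard ARIMA-style differencing (helper names are mine):

```python
# First differencing (what d = 1 applies before fitting) and the
# inverse transform that maps differenced forecasts back to levels.
def difference(series):
    return [b - a for a, b in zip(series, series[1:])]

def undifference(diffs, last_level):
    out, level = [], last_level
    for d in diffs:
        level += d
        out.append(level)
    return out

x = [10.0, 12.0, 11.0, 15.0]
dx = difference(x)                      # [2.0, -1.0, 4.0]
assert undifference(dx, x[0]) == x[1:]  # round-trips back to levels
```

One cost of differencing visible even in this sketch: the differenced series is one point shorter than the original, which matters when only 30 training points are available.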
Hello Chris,
Thanks for using this library. About your question, there are a couple of points to consider:
1] The number of training data points matters for prediction quality. In your example there are 30 training data points and you requested 6 future data points, which stretches the limits of the model's predictive ability. Given that the model requires 7 parameters, 30 training data points are not enough to make strong predictions.
2] In practice, you need to perform hyper-parameter tuning over p, d, q, P, D, Q, m to find the best parameters for your dataset. When I ran a test with your dataset, I found that (p, d, q, P, D, Q, m) = (0, 1, 2, 2, 0, 2, 3) gave the best RMSE score.
For this particular example, I would give the model more training data points in order to get a more meaningful prediction.
Thanks,
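The tuning the reply describes can be sketched as a holdout grid search: reserve the tail of the training series, score each candidate configuration by holdout RMSE, and keep the best. The `moving_average_forecast` stand-in below is a deliberately naive model, not the library's ARIMA; with the real library you would substitute its fit/predict call and iterate over (p, d, q, P, D, Q, m) tuples instead of a single window parameter:

```python
import math

def rmse(forecast, actual):
    # root-mean-square error between two equal-length series
    return math.sqrt(
        sum((f - a) ** 2 for f, a in zip(forecast, actual)) / len(forecast)
    )

def moving_average_forecast(train, horizon, window):
    # naive stand-in model: repeat the mean of the last `window` points;
    # replace with the library's ARIMA fit/predict in practice
    level = sum(train[-window:]) / window
    return [level] * horizon

def grid_search(series, horizon, windows):
    # hold out the last `horizon` points and score each candidate on them
    train, holdout = series[:-horizon], series[-horizon:]
    best = None
    for w in windows:
        score = rmse(moving_average_forecast(train, horizon, w), holdout)
        if best is None or score < best[1]:
            best = (w, score)
    return best

# toy usage: on a steadily rising series the shortest window wins
series = [float(v) for v in range(1, 21)]
best_window, best_score = grid_search(series, horizon=3, windows=[1, 2, 5])
```

Holdout scoring like this is also a fairer comparison between d = 0 and d = 1 than in-sample error, since differencing changes what the model is fit to.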