Observation Weights
Allow the user to specify observation weights for the past which can be used to discount early periods.
Just wondering if this would also allow for inverse-variance weighting? And if not, is some sort of observation weighting in the works?
Are there any updates on the problem?
What is the status of this enhancement? It's a really useful feature that I would like to see in the package.
This would indeed be a very welcome feature. In the meantime, is there any way one could add an additional column ( = an additional regressor) that could simulate this?
I had a similar problem and got slightly better results with an additional column/regressor (MAPE dropped from 0.274 to 0.256 in cross-validation). My input dataframe holds hourly traffic time series data; I prepared a simple, linearly increasing day-count weight and applied it as a multiplicative regressor, as below. Any comments?
df['recency'] = df.index // 24
...
m.add_regressor('recency', prior_scale=10, mode='multiplicative')
...
future["recency"] = future.index // 24
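For readers trying this out, here is a minimal pandas-only sketch of how such a day-count column behaves. The dates and `y` values are placeholders, and it assumes the future frame also contains the history rows (which is what Prophet's `make_future_dataframe` returns by default); it is not necessarily how the original poster built their frames.

```python
import pandas as pd

hour = pd.Timedelta(hours=1)

# Three days of hypothetical hourly history; y values are placeholders.
df = pd.DataFrame({
    "ds": pd.date_range("2021-01-01", periods=72, freq=hour),
    "y": range(72),
})
df["recency"] = df.index // 24   # 0 for day one, 1 for day two, ...

# Assumed future frame: the 72 history rows plus 24 new hourly rows.
future = pd.DataFrame({
    "ds": pd.date_range("2021-01-01", periods=96, freq=hour),
})
future["recency"] = future.index // 24   # new rows continue the count
```

Under that assumption the forecast rows carry on from the last historical day rather than restarting at 0.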
@bletham How would the authors propose handling periodic data where the sample size n varies markedly from period to period? Can I get the desired effect by repeating each weekly observation n times in the input file? Will the confidence intervals be correct?
@numeric-lee To understand better: you have weekly data, and for each week there are multiple observations. It sounds like these are being averaged? And the sample size is the number of points included in that average, which is different each week?
Adding each point in n times would be equivalent to weighting the observations appropriately. It might be computationally slow though depending on how big n is.
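A minimal sketch of that repetition workaround in pandas, assuming the history has a column `n` holding each week's sample size (the column name and the toy values are made up for illustration):

```python
import pandas as pd

# Toy weekly history; 'n' (sample size per week) is a hypothetical column.
history = pd.DataFrame({
    "ds": pd.date_range("2021-01-03", periods=4, freq=pd.Timedelta(weeks=1)),
    "y": [10.0, 12.0, 11.0, 15.0],
    "n": [1, 3, 2, 5],
})

# Repeat each row n times so that each week counts n times in the fit.
expanded = (
    history.loc[history.index.repeat(history["n"]), ["ds", "y"]]
    .reset_index(drop=True)
)
```

`expanded` has sum(n) rows and would be passed to the fit in place of the averaged frame, which is where the slowdown for large n comes from.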
@bletham Thanks, that seems to be an adequate workaround. Is there an alternative available today (such as a weighting vector) that would be computationally faster, or is anything on the horizon?
I think that if you wanted to prototype this in a local fork for your current use case, it actually wouldn't be too involved. You would add the weights into the likelihood in the Stan model here:
https://github.com/facebook/prophet/blob/e41ed25646f44f713c110c30c07c678e4a07728e/python/stan/unix/prophet.stan#L140
(sigma_obs here is the standard deviation, so you would replace that with the standard error, which is sigma_obs / w, where the weight w would be set to sqrt(n)).
You then just need to pass vector[T] w into the Stan model in this block:
https://github.com/facebook/prophet/blob/e41ed25646f44f713c110c30c07c678e4a07728e/python/stan/unix/prophet.stan#L87
and then add it to the Python or R code where you pass the data over to Stan (I'll show Python here, but it's parallel for R). You would add something like
'w': np.sqrt(history['n'].values)
to the dictionary here: https://github.com/facebook/prophet/blob/e41ed25646f44f713c110c30c07c678e4a07728e/python/fbprophet/forecaster.py#L1124
You would then install this local version, and then if you include n as a column in the dataframe that you pass into Prophet, it should work.
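To see why w = sqrt(n) is the matching choice, here is a small NumPy check. It is not the Prophet Stan model, only an ordinary least-squares analogue on made-up data: dividing the noise scale by w amounts to multiplying each residual by w, and with w = sqrt(n) the fitted coefficients match those obtained by repeating each point n times.

```python
import numpy as np

rng = np.random.default_rng(0)
t = np.arange(8, dtype=float)
n = np.array([1, 4, 2, 9, 3, 1, 6, 2])   # hypothetical sample sizes
# Each y is an average of n points, so its noise shrinks like 1/sqrt(n).
y = 2.0 + 0.5 * t + rng.normal(0.0, 1.0, t.size) / np.sqrt(n)

# Weighted fit: np.polyfit multiplies each unsquared residual by w,
# so w = sqrt(n) corresponds to a per-point scale of sigma / sqrt(n).
coef_weighted = np.polyfit(t, y, deg=1, w=np.sqrt(n))

# Unweighted fit on data where each point is repeated n times.
coef_repeated = np.polyfit(np.repeat(t, n), np.repeat(y, n), deg=1)

same = np.allclose(coef_weighted, coef_repeated)
```

The check only compares the fitted coefficients; it says nothing about whether the uncertainty intervals agree between the two approaches.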
Thanks so much, Ben!
@BryanKoo What do your df and future indexes look like: A or B?
Here is the 'df' :
df
index | ds | recency | y
0 | 1/1/2021 | 0 | 10
1 | 2/1/2021 | 0.041667 | 20
2 | 3/1/2021 | 0.083333 | 30
3 | 4/1/2021 | 0.125 | 40
future dataframe : A
A : future
index | ds | recency
0 | 5/1/2021 | 0
1 | 6/1/2021 | 0.041667
2 | 7/1/2021 | 0.083333
3 | 8/1/2021 | 0.125
future dataframe : B
B: future
index | ds | recency
4 | 5/1/2021 | 0.166667
5 | 6/1/2021 | 0.208333
6 | 7/1/2021 | 0.25
7 | 8/1/2021 | 0.291667
TIA..
I just feel let down that Prophet doesn't support something as basic as this. In my problem the old data are still useful, but I don't want to give them much weight in the fit, so that the regressor coefficients are influenced more by recent points than by older ones.