fable
fable copied to clipboard
Re-consider IC output for refitted models
Using the latest github version of fable here (but issue is also present on CRAN version).
library(tsibble)
library(fable)
#> Loading required package: fabletools
library(dplyr)
#>
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#>
#> filter, lag
#> The following objects are masked from 'package:base':
#>
#> intersect, setdiff, setequal, union
lung_deaths <- as_tsibble(mdeaths)
# Fit ARIMA on first part of timeseries.
ts1 <- lung_deaths %>% filter(index <= as.Date("1977-01-01"))
#> Warning in mask$eval_all_filter(dots, env_filter): Incompatible methods
#> ("<=.vctrs_vctr", "<=.Date") for "<="
fit <- ts1 %>%
model(ARIMA(value ~ 1 + pdq(1,0,0) + PDQ(0,0,0)))
# Refit ARIMA with one more observation
ts2 <- lung_deaths %>% filter(index > as.Date("1977-01-01"))
#> Warning in mask$eval_all_filter(dots, env_filter): Incompatible methods
#> (">.vctrs_vctr", ">.Date") for ">"
fit %>%
refit(ts2 %>% head(1)) %>%
report()
#> Series: value
#> Model: ARIMA(1,0,0) w/ mean
#>
#> Coefficients:
#> ar1 constant
#> 0.7851 364.8801
#> s.e. 0.1030 43.9553
#>
#> sigma^2 estimated as 1579: log likelihood=-5.58
#> AIC=13.16 AICc=9.16 BIC=11.16
# Refit ARIMA with two more observations
fit %>%
refit(ts2 %>% head(2)) %>%
report()
#> Warning: It looks like you're trying to fully specify your ARIMA model but have not said if a constant should be included.
#> You can include a constant using `ARIMA(y~1)` to the formula or exclude it by adding `ARIMA(y~0)`.
#> Error: Problem with `mutate()` input `ARIMA(value ~ 1 + pdq(1, 0, 0) + PDQ(0, 0, 0))`.
#> x Could not find an appropriate ARIMA model.
#> This is likely because automatic selection does not select models with characteristic roots that may be numerically unstable.
#> For more details, refer to https://otexts.com/fpp3/arima-r.html#plotting-the-characteristic-roots
#> ℹ Input `ARIMA(value ~ 1 + pdq(1, 0, 0) + PDQ(0, 0, 0))` is `(function (object, ...) ...`.
Created on 2020-05-18 by the reprex package (v0.3.0)
@robjhyndman
This is coming from here (which comes from forecast::auto.arima())
https://github.com/tidyverts/fable/blob/64b36c6df3285e33c912991a1cc885604bbb3158/R/arima.R#L153
Any reason why npar is the number of estimated coefficients +1?
Also @TimothyHyndman, when you say "Refit ARIMA with two more observations", you're refitting the model with only two observations (not two more observations). You'll need to refit the model with the historical data and the next two observations (or use stream.ARIMA once #251 is done).
npar = # coefficients + 1 to account for the residual variance. This is how the AIC/AICc/BIC defines # parameters in a model.
Got it, thanks. https://github.com/tidyverts/fable/blob/64b36c6df3285e33c912991a1cc885604bbb3158/R/arima.R#L159
Then in this case (n=2, d=0, D=0), nstar = 2 giving division by (nstar - npar - 1) = 0. Is this correct?
It is problematic computing AICc on a refit because the parameters were not estimated on that data set. If the data was used for estimation, then the equation is correct. But you would expect nstar to always be bigger than npar+1 or you would be over-fitting.
I'm not sure what we should return as AICc value on a refit -- possibly the original AICc on the original data, or perhaps NA or NULL.
Also @TimothyHyndman, when you say "Refit ARIMA with two more observations", you're refitting the model with only two observations (not two more observations). You'll need to refit the model with the historical data and the next two observations (or use stream.ARIMA once #251 is done).
Ah, I was thinking that refit used the data contained in fit in addition to the data passed in with new_data = . Thanks for the correction.