fable
Feature Request: Automatic K optimization for Fourier Terms
If I wish to fit a regression with Fourier terms then to find the optimal K I need to do something like this:
library(fable)
library(dplyr)
library(tidyr)

mbl = tsibbledata::ansett %>%
  tsibble::fill_gaps() %>%
  model(arima1 = ARIMA(Passengers ~ fourier(K = 1) + PDQ(0,0,0)),
        arima2 = ARIMA(Passengers ~ fourier(K = 2) + PDQ(0,0,0)),
        arima3 = ARIMA(Passengers ~ fourier(K = 3) + PDQ(0,0,0)))

metrics = mbl %>%
  glance()

mbl_best = metrics %>%
  select(Airports, Class, .model, AICc) %>%
  group_by(Airports, Class) %>%
  slice(which.min(AICc)) %>%
  left_join(mbl %>%
              gather('.model', 'model', -Airports, -Class),
            by = c('.model', 'Airports', 'Class')) %>%
  as_mable(key = c('Airports', 'Class'), models = 'model')
It would be more convenient for K to be automatically determined through something like this:
model(arima = ARIMA(Passengers ~ fourier(K = 1:3) + PDQ(0,0,0)))
On that note, when I look at the source code for ARIMA, it appears that when fitting a regression with ARIMA errors, the number of differences is determined after the regression. Because of this, it seems entirely possible that the arima1, arima2 and arima3 models I fit could end up with different orders of differencing, in which case their AICc values would not be directly comparable. If this is indeed the case, perhaps determining K through cross-validation would be better?
Thanks!
Automating the choice of K could be a feature we look at in a future release. It is very unlikely to affect the order of differencing, so I think using AICc for selection is safe enough.
This is something which will need to be added on a model by model basis, as each model will have different methods of model selection.
Could we iteratively select the best K based on whatever criterion is used in the base model? My idea is to fit Fourier series of different K linearly to the response and select the one with the best value of the criterion passed by the base model. Is there a case where we wouldn't want to fit it linearly? I'll admit that re-estimation after fitting the rest of the model would be good, but this might provide some direction for a user who doesn't know which K to select.
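To make the idea concrete, here is a rough sketch of such a linear pre-screening step done by hand. It is not an existing fable feature: the names `ansett_filled`, `candidates`, `screen` and `best_K` are made up for illustration, and it assumes that `TSLM()` accepts `!!` unquoting in its formula and that `model()` accepts a spliced list of model definitions.

```r
library(fable)
library(dplyr)

ansett_filled <- tsibbledata::ansett %>% tsibble::fill_gaps()

# One linear model per candidate K; `!!k` inlines the current value of k
# into the formula (assumed to be supported by TSLM's formula capture).
candidates <- lapply(1:3, function(k) TSLM(Passengers ~ fourier(K = !!k)))
names(candidates) <- paste0("K", 1:3)

# Fit every candidate to every series, then keep the lowest-AICc
# candidate for each Airports/Class combination.
screen <- ansett_filled %>%
  model(!!!candidates) %>%
  glance() %>%
  group_by(Airports, Class) %>%
  slice_min(AICc, n = 1, with_ties = FALSE) %>%
  ungroup() %>%
  select(Airports, Class, best_K = .model, AICc)
```

The K chosen this way ignores the ARIMA error structure, so it is only a starting point; the final model would still need to be re-estimated with that K.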
As an interim solution, I am trying to fit multiple models in a loop so that I do not have to repeat the formula so many times.
Yet I am struggling (I do not have that much of a background in tidy R).
Could you help?
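One possible way to avoid repeating the formula is to build the model definitions in a loop and splice them into `model()`. This is a sketch, not an official recipe: `ks` and `specs` are illustrative names, and it assumes that `ARIMA()` supports `!!` unquoting inside its formula and that `model()` supports `!!!` splicing of a named list.

```r
library(fable)
library(dplyr)

ks <- 1:3

# Build one ARIMA definition per K. `!!k` substitutes the current value
# of k into the formula at the time the definition is created.
specs <- lapply(ks, function(k) {
  ARIMA(Passengers ~ fourier(K = !!k) + PDQ(0, 0, 0))
})
names(specs) <- paste0("arima", ks)

# Splice the named list into model(); each name becomes a mable column.
mbl <- tsibbledata::ansett %>%
  tsibble::fill_gaps() %>%
  model(!!!specs)

# Lowest-AICc model name for each series.
best <- glance(mbl) %>%
  group_by(Airports, Class) %>%
  slice_min(AICc, n = 1, with_ties = FALSE) %>%
  ungroup()
```

Changing `ks` (for example to `1:6`) is then the only edit needed to try more Fourier terms; the selection step stays the same.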