modeltime
extract_nested_future_forecast() fails when adding multiple predictors in recipe() with +
See the reprex below. When I add a single predictor via recipe(y ~ x), or use all predictors via recipe(y ~ .), I can get future predictions from nested models using extract_nested_future_forecast(). When I add multiple predictors via recipe(y ~ x1 + x2), no error is raised, but the extract function returns a 0x0 tibble.
This works
library(gapminder)
library(tidyverse)
library(modeltime)
library(tidymodels)
library(lubridate)
# Add a date component to the data frame
gapminder_tbl <- gapminder %>%
group_by(country) %>%
mutate(Date = ymd(paste(year,01,01,sep = "-"))) %>%
filter(continent %in% c("Oceania")) # avoid having too much data for the example
# Nest each country and extend the time series
nested_data_gapminder_tbl <- gapminder_tbl %>%
# Step 1: Extend the time series by country
extend_timeseries(
.id_var = country,
.date_var = Date,
.length_future = 5 # Extend by five years
) %>%
# Step 2: Nests the time series into .actual_data and .future_data
nest_timeseries(
.id_var = country,
.length_future = 5
) %>%
# Step 3: Adds a column .splits that contains training/testing indices
split_nested_timeseries(
.length_test = 4
)
# Create the recipe for prediction
rec_prohpet <- recipe(lifeExp ~ Date,
data = extract_nested_train_split(nested_data_gapminder_tbl))
# Create workflow object
wflw_prophet <- workflow() %>%
add_model(
prophet_reg("regression", seasonality_yearly = TRUE) %>%
set_engine("prophet" )
) %>%
add_recipe(rec_prohpet)
# Fit the model to the nested data
nested_modeltime_tbl <- modeltime_nested_fit(
nested_data = nested_data_gapminder_tbl,
wflw_prophet
)
# Model performance
nested_modeltime_tbl %>%
extract_nested_test_accuracy() %>%
table_modeltime_accuracy(.interactive = FALSE)
# Refit to entire dataset and predict again
nested_modeltime_refit_tbl <- nested_modeltime_tbl %>%
modeltime_nested_refit(
control = control_nested_refit(verbose = TRUE)
)
# Extract predictions
preds<- nested_modeltime_refit_tbl %>%
extract_nested_future_forecast()
This does not work as expected:
library(gapminder)
library(tidyverse)
library(modeltime)
library(tidymodels)
library(lubridate)
# Add a date component to the data frame
gapminder_tbl <- gapminder %>%
group_by(country) %>%
mutate(Date = ymd(paste(year,01,01,sep = "-"))) %>%
filter(continent %in% c("Oceania")) # avoid having too much data for the example
# Nest each country and extend the time series
nested_data_gapminder_tbl <- gapminder_tbl %>%
# Step 1: Extend the time series by country
extend_timeseries(
.id_var = country,
.date_var = Date,
.length_future = 5 # Extend by five years
) %>%
# Step 2: Nests the time series into .actual_data and .future_data
nest_timeseries(
.id_var = country,
.length_future = 5
) %>%
# Step 3: Adds a column .splits that contains training/testing indices
split_nested_timeseries(
.length_test = 4
)
# Create the recipe for prediction
rec_prohpet <- recipe(lifeExp ~ Date + pop,
data = extract_nested_train_split(nested_data_gapminder_tbl))
# Create workflow object
wflw_prophet <- workflow() %>%
add_model(
prophet_reg("regression", seasonality_yearly = TRUE) %>%
set_engine("prophet" )
) %>%
add_recipe(rec_prohpet)
# Fit the model to the nested data
nested_modeltime_tbl <- modeltime_nested_fit(
nested_data = nested_data_gapminder_tbl,
wflw_prophet
)
# Model performance
nested_modeltime_tbl %>%
extract_nested_test_accuracy() %>%
table_modeltime_accuracy(.interactive = FALSE)
# Refit to entire dataset and predict again
nested_modeltime_refit_tbl <- nested_modeltime_tbl %>%
modeltime_nested_refit(
control = control_nested_refit(verbose = TRUE)
)
# Extract predictions
preds<- nested_modeltime_refit_tbl %>%
extract_nested_future_forecast()
The reason I think the issue is with extract_nested_future_forecast() is that
nested_modeltime_tbl_imp %>%
extract_nested_test_accuracy() %>%
table_modeltime_accuracy(.interactive = T)
Provides a table as expected. So the models get trained, it's just not possible to extract the future forecasts.
I too have this issue
Hi,
I can reproduce your results, but I don't think this is a bug.
Both recipe(lifeExp ~ .) and recipe(lifeExp ~ Date + pop) return a 0x0 tibble.
The problem is that you are extending your time series with the extend_timeseries() function, so the xregs get NA values in the future table. When you then predict from this table using xregs, you don't get any predictions because the xregs have no values.
You need to use a left_join() to give values to your xregs in the future table to be able to get the predictions.
Regards,
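A minimal sketch of the left_join() step described above, using only dplyr on a toy table so that it runs on its own. The table names (extended_tbl, future_pop_tbl), the helper column pop_future, and all numbers are hypothetical placeholders; extended_tbl only stands in for the shape of what extend_timeseries() returns, where future rows carry NA for the xreg.

```r
library(dplyr)
library(tibble)

# Toy stand-in for the extended data: the last two rows are future rows,
# so both the outcome and the xreg (pop) are NA there. Values are made up.
extended_tbl <- tibble(
  country = rep("Australia", 4),
  Date    = as.Date(c("2002-01-01", "2007-01-01", "2012-01-01", "2017-01-01")),
  lifeExp = c(80, 81, NA, NA),
  pop     = c(19500000, 20400000, NA, NA)
)

# Hypothetical future xreg values, e.g. from an external projection.
future_pop_tbl <- tibble(
  country    = "Australia",
  Date       = as.Date(c("2012-01-01", "2017-01-01")),
  pop_future = c(21500000, 22600000)
)

# Join the future values on the id and date columns, then fill only the
# rows where pop is missing and drop the helper column.
filled_tbl <- extended_tbl %>%
  left_join(future_pop_tbl, by = c("country", "Date")) %>%
  mutate(pop = coalesce(pop, pop_future)) %>%
  select(-pop_future)
```

In the reprex, this fill would presumably go between extend_timeseries() and nest_timeseries(), so that .future_data already contains pop when the forecast is extracted.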
Ahh, of course, that makes sense @AlbertoAlmuinha - thanks! I'll close the issue
Hi everyone, I'm having the same issue. I know this topic is closed, but perhaps you could show where you left_join the xregs?
I love modeltime and use it for all of my time-series forecasts. I used nested and global forecasts all the time, and now I'd like to add xregs to my nested forecasts. If anyone could help on this exact topic, I could figure it out from there.
Could you explain this solution in a bit more detail? I'm running into a similar issue and can't really come to a resolution from the responses to date. Thanks!
@AlbertoAlmuinha @rlohne I would also appreciate more information on what objects need to be joined.
@mgree013 and @nrjenkins:
If you use xregs in the model, they need to be present in the future data as well. In the example, pop (population) is used as an external regressor:
# Create the recipe for prediction
rec_prohpet <- recipe(lifeExp ~ Date + pop,
data = extract_nested_train_split(nested_data_gapminder_tbl))
However, in the future forecasting dataset, values for the population of the oceanic countries are not present, thus the model fails.
The screenshot above illustrates this, on the left is the data.frame, where Date + pop is the data the model is trained on, and on the right is the future data.frame, where you can see that there are no values for pop.
So the solution would be to predict those values, and then left_join those predictions to the future data.frame. This is, however, where things get messy, because you are creating forecasts of xregs to be used in another forecast.
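One way to picture the "forecast the xreg first" step is the sketch below. It assumes a plain per-country linear trend via lm() as the xreg model, which is only an illustration and not something the thread prescribes; the tables history_tbl and future_tbl and all numbers are hypothetical toy data.

```r
library(dplyr)
library(tibble)

# Toy history of the xreg (pop) for two hypothetical countries.
history_tbl <- tibble(
  country = rep(c("A", "B"), each = 4),
  year    = rep(c(1992, 1997, 2002, 2007), times = 2),
  pop     = c(100, 110, 120, 130, 50, 52, 54, 56)
)

# Future rows that still lack the xreg.
future_tbl <- tibble(
  country = rep(c("A", "B"), each = 2),
  year    = rep(c(2012, 2017), times = 2)
)

# Fit one linear trend per country and predict pop for that country's
# future years. Any other forecasting model could take the place of lm().
pop_forecast_tbl <- history_tbl %>%
  group_by(country) %>%
  group_modify(~ {
    fit <- lm(pop ~ year, data = .x)
    new_years <- future_tbl %>%
      filter(country == .y$country) %>%
      select(year)
    tibble(
      year = new_years$year,
      pop  = as.numeric(predict(fit, newdata = new_years))
    )
  }) %>%
  ungroup()

# Attach the forecast xreg to the future rows.
future_with_xreg_tbl <- future_tbl %>%
  left_join(pop_forecast_tbl, by = c("country", "year"))
```

Any error in the pop forecast carries over into the lifeExp forecast, which is the "messy" part mentioned above.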