fabletools icon indicating copy to clipboard operation
fabletools copied to clipboard

Modeling an empty tsibble (with key) should not error

Open Fuco1 opened this issue 3 years ago • 2 comments

In batch processing consistency is king. Sometimes we might apply filter on a dataset and remove all the data. Still calling model on such a tsibble should, in my opinion, not produce an error but return an empty mable. This allows us to write simple and consistent pipelines without unnecessary error checking.

This works:

> tsibble(i = 1:12, quantity = 1, index = "i") %>% filter(i > 1) %>% model(x = MEAN(quantity))
# A mable: 1 x 1
        x
  <model>
1  <MEAN>
> tsibble(i = 1:12, quantity = 1, index = "i") %>% filter(i > 12) %>% model(x = MEAN(quantity))
# A mable: 1 x 1
             x
       <model>
1 <NULL model>
Warning message:
1 error encountered for x
[1] All observations are missing, a model cannot be estimated without data.

but having a tsibble with key fails:

tsibble(i = 1:12, k = 1, quantity = 1, index = "i", key = "k") %>% filter(i > 12) %>% model(x = MEAN(quantity))
Error in `[[.default`(mdl, 1) : subscript out of bounds
> 

Fuco1 avatar May 06 '21 23:05 Fuco1

With the current release versions of tsibble, feasts, fable and fabletools, I am getting the error for both the second and third lines above, rather than a <NULL model> for the second.

> tsibble(i = 1:12, quantity = 1, index = "i") %>% filter(i > 12) %>% model(x = MEAN(quantity))
Error in .data[[index_var(.data)]][[1]] : subscript out of bounds

Perhaps a recent change to fabletools is keeping the second example from getting through to the fable::MEAN(), which creates the <NULL model> in response to:

if (all(is.na(y))) {
    abort("All observations are missing, a model cannot be estimated without data.")  
}

It seems like fabletools::estimate() might be a decent place for a check that calls null_model() in the face of no data, as I cannot conceive of a model that will fit without data, but my lack of imagination might be showing itself.

davidtedfordholt avatar Jun 09 '21 15:06 davidtedfordholt

This should return a zero-row mable, but would require some restructuring of how and where some attributes like response variable are stored. Currently the response variables are grabbed from the estimated models, but in a zero-row mable there are no models to grab from.

mitchelloharawild avatar Mar 02 '24 04:03 mitchelloharawild