
Add check_residuals() function

Open robjhyndman opened this issue 5 years ago • 12 comments

Essentially a wrapper to

  augment(model) %>%
  features(.resid, ljung_box, lag=<10 or 2*period>, dof=<from model>)
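
A minimal sketch of what such a wrapper could look like, for illustration only. The function body, its arguments and their defaults are assumptions rather than anything in fabletools, and it reads the `.innov` column that recent fabletools versions return from `augment()` (the snippet above uses `.resid`). The `dof` value still has to be supplied by hand here, since models do not yet expose it.

```r
library(fpp3)

# Hypothetical helper, not part of fabletools: the name, arguments and
# defaults are assumptions made for illustration only.
check_residuals <- function(fit, lag = 10, dof = 0) {
  augment(fit) %>%
    features(.innov, ljung_box, lag = lag, dof = dof)
}

# Quarterly data, so 2 * period = 8 lags.
fit <- aus_production %>%
  model(ets = ETS(Beer))

check_residuals(fit, lag = 8)
```

The result has one row per model (and per key), with the `lb_stat` and `lb_pvalue` columns produced by `feasts::ljung_box()`.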

robjhyndman avatar Aug 07 '19 01:08 robjhyndman

Or maybe this should be called test_residuals().

robjhyndman avatar Aug 07 '19 01:08 robjhyndman

Certainly possible. This would also require models to add the dof to the glance() output or similar (much like forecast:::modeldf()). The period can be determined from the tsibble.

I think the interface needs more thought to ensure that a consistent and general interface is preserved throughout the package.

  • Checking/testing the residuals would often involve more than just the Ljung-Box test. Should there be a tag which is used for model testing?
  • How would users specify the tests that they are interested in?
  • What other parameters from the model fit may be useful when computing a feature?
  • Are there other functions which should wrap features()? Should this be common practice?

mitchelloharawild avatar Aug 07 '19 01:08 mitchelloharawild

Adding dof to the glance output seems like a good idea in any case. Yes, making it more general might be a good idea, although the use in the textbook is almost always LB apart from regression models where LB is controversial and Breusch-Godfrey is sometimes preferred.

robjhyndman avatar Aug 07 '19 02:08 robjhyndman

This would be great for mables where each model has a different dof.

mbg-unsw avatar Mar 10 '21 21:03 mbg-unsw

I've now added (experimentally) hypothesize() methods in fabletools (0f3c42f6c6e1aa837de3ca5385447387bbdc1f48) for running statistical tests on fitted models. It is very similar to features(), but more oriented to computing tests on fitted models, rather than features on data. Note that tests can be features, but not the other way round. Note that hypothesise() will be available once https://github.com/r-lib/generics/issues/55 is resolved. As an example of how this function works, I have also added breusch_godfrey() in https://github.com/tidyverts/fable/commit/0f3c42f6c6e1aa837de3ca5385447387bbdc1f48 which can be used as follows:

library(fpp3)
tourism %>% 
  model(TSLM(Trips ~ trend() + season())) %>% 
  hypothesize(tests = lst(breusch_godfrey), order = 24)
#> # A tibble: 304 x 9
#>    Region   State    Purpose .model     .test  statistic order null_dist p.value
#>    <chr>    <chr>    <chr>   <chr>      <chr>      <dbl> <int>    <dist>   <dbl>
#>  1 Adelaide South A… Busine… TSLM(Trip… breus…      23.3    24    ᵪ²(24)  0.500 
#>  2 Adelaide South A… Holiday TSLM(Trip… breus…      26.1    24    ᵪ²(24)  0.346 
#>  3 Adelaide South A… Other   TSLM(Trip… breus…      34.7    24    ᵪ²(24)  0.0732
#>  4 Adelaide South A… Visiti… TSLM(Trip… breus…      29.7    24    ᵪ²(24)  0.194 
#>  5 Adelaid… South A… Busine… TSLM(Trip… breus…      24.8    24    ᵪ²(24)  0.414 
#>  6 Adelaid… South A… Holiday TSLM(Trip… breus…      24.1    24    ᵪ²(24)  0.458 
#>  7 Adelaid… South A… Other   TSLM(Trip… breus…      25.5    24    ᵪ²(24)  0.377 
#>  8 Adelaid… South A… Visiti… TSLM(Trip… breus…      12.1    24    ᵪ²(24)  0.979 
#>  9 Alice S… Norther… Busine… TSLM(Trip… breus…      26.5    24    ᵪ²(24)  0.327 
#> 10 Alice S… Norther… Holiday TSLM(Trip… breus…      30.7    24    ᵪ²(24)  0.163 
#> # … with 294 more rows

Created on 2021-04-08 by the reprex package (v1.0.0)

I do think that it should be easy to compute both Ljung-Box and Breusch-Godfrey tests on regression models, and at most it should hint toward Breusch-Godfrey for regression models in the documentation.

mitchelloharawild avatar Apr 08 '21 04:04 mitchelloharawild

Thanks, that looks great.

I assume we'll also need new Ljung-Box and Breusch-Godfrey methods for ARIMA that can pick up the dof from each model?

mbg-unsw avatar Apr 09 '21 00:04 mbg-unsw

Ljung-Box and Box-Pierce tests will be written to work with any model that makes the degrees of freedom available. This will be the next one to add; however, it will require some migration of feasts::ljung_box() to fabletools::ljung_box().
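
To illustrate why the degrees of freedom matter, here is a rough base R sketch. It uses `stats::Box.test()` as a stand-in for the planned fabletools method, and the simulated AR(1) series and the seed are assumptions chosen only for the example. The `fitdf` argument plays the role of `dof`: the statistic is compared against a chi-squared distribution with `lag - fitdf` degrees of freedom.

```r
# Base R only; Box.test() stands in for the planned fabletools ljung_box().
set.seed(2019)
y <- arima.sim(model = list(ar = 0.6), n = 200)

# AR(1) without a mean term, so exactly one coefficient is estimated.
fit <- arima(y, order = c(1, 0, 0), include.mean = FALSE)
dof <- length(coef(fit))

# fitdf reduces the chi-squared degrees of freedom from lag to lag - dof.
lb <- Box.test(residuals(fit), lag = 10, type = "Ljung-Box", fitdf = dof)
lb$parameter
```

In a mable with several models, each model would need to report its own `dof` for this adjustment to be applied automatically.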

mitchelloharawild avatar Apr 09 '21 00:04 mitchelloharawild

@mitchelloharawild I've updated my fabletools package and am unable to run breusch_godfrey() on my TSLM model.

remotes::install_github("tidyverts/fabletools")

fit_trend <- q1_ts %>%
  mutate(surfing_festival = ifelse(month(month) == 3 & year(month) > 1987, 1, 0)) %>%
  model(exponential = TSLM(log(sales) ~ trend() + season() + surfing_festival))
report(fit_trend)

I've tried:

fit_trend %>% hypothesise(tests = lst(breusch_godfrey), order = 24)
Error in hypothesise(., tests = lst(breusch_godfrey), order = 24) : could not find function "hypothesise"

and:

fable::breusch_godfrey(fit_trend)
Error in UseMethod("breusch_godfrey") : no applicable method for 'breusch_godfrey' applied to an object of class "c('mdl_df', 'tbl_df', 'tbl', 'data.frame')"

Any guidance would be appreciated.

baumstan avatar Jul 18 '21 17:07 baumstan

Looks like you haven't yet loaded the development version. Try restarting R to unload the CRAN version of fabletools so that next time you load the fabletools package, you will have the dev version and access to these new functions.

mitchelloharawild avatar Jul 19 '21 01:07 mitchelloharawild

Thank you. I'd loaded but not restarted. This code works:

fit_trend %>%
  hypothesize(tests = lst(breusch_godfrey), order = 1)

But this one doesn't...

fable::breusch_godfrey(fit_trend, order = 1)

Could you confirm that I've correctly used the hypothesize option given that my model is a regression not an ARIMA?

baumstan avatar Jul 19 '21 01:07 baumstan

Yes, the first code snippet is the current interface for running the test.

mitchelloharawild avatar Jul 19 '21 01:07 mitchelloharawild

An alternative generic function is needed for computing values from distributions, such as Newey-West (https://github.com/tidyverts/fable/issues/332). The function could/would act very similarly to what we have described here.

mitchelloharawild avatar Aug 26 '21 02:08 mitchelloharawild