How to change period in `features(var, feat_stl)`

Open • Aariq opened this issue 2 years ago • 2 comments

I can't figure out how to pass arguments to feat_stl() when it is used inside features(). I expected an anonymous function to work, but it doesn't.


library(feasts)
library(tsibble)
#> Attaching package: 'tsibble'
#> The following objects are masked from 'package:base':
#>     intersect, setdiff, union

tourism |> 
  model(STL(Trips ~ season("1 year")))
#> # A mable: 304 x 4
#> # Key:     Region, State, Purpose [304]
#>    Region         State              Purpose  `STL(Trips ~ season("1 year"))`
#>    <chr>          <chr>              <chr>                            <model>
#>  1 Adelaide       South Australia    Business                           <STL>
#>  2 Adelaide       South Australia    Holiday                            <STL>
#>  3 Adelaide       South Australia    Other                              <STL>
#>  4 Adelaide       South Australia    Visiting                           <STL>
#>  5 Adelaide Hills South Australia    Business                           <STL>
#>  6 Adelaide Hills South Australia    Holiday                            <STL>
#>  7 Adelaide Hills South Australia    Other                              <STL>
#>  8 Adelaide Hills South Australia    Visiting                           <STL>
#>  9 Alice Springs  Northern Territory Business                           <STL>
#> 10 Alice Springs  Northern Territory Holiday                            <STL>
#> # … with 294 more rows

tourism |> 
  features(Trips, feat_stl)
#> # A tibble: 304 × 12
#>    Region  State Purpose trend…¹ seaso…² seaso…³ seaso…⁴ spiki…⁵ linea…⁶ curva…⁷
#>    <chr>   <chr> <chr>     <dbl>   <dbl>   <dbl>   <dbl>   <dbl>   <dbl>   <dbl>
#>  1 Adelai… Sout… Busine…   0.464   0.407       3       1 1.58e+2  -5.31   71.6  
#>  2 Adelai… Sout… Holiday   0.554   0.619       1       2 9.17e+0  49.0    78.7  
#>  3 Adelai… Sout… Other     0.746   0.202       2       1 2.10e+0  95.1    43.4  
#>  4 Adelai… Sout… Visiti…   0.435   0.452       1       3 5.61e+1  34.6    71.4  
#>  5 Adelai… Sout… Busine…   0.464   0.179       3       0 1.03e-1   0.968  -3.22 
#>  6 Adelai… Sout… Holiday   0.528   0.296       2       1 1.77e-1  10.5    24.0  
#>  7 Adelai… Sout… Other     0.593   0.404       2       2 4.44e-4   4.28    3.19 
#>  8 Adelai… Sout… Visiti…   0.488   0.254       0       3 6.50e+0  34.2    -0.529
#>  9 Alice … Nort… Busine…   0.534   0.251       0       1 1.69e-1  23.8    19.5  
#> 10 Alice … Nort… Holiday   0.381   0.832       3       1 7.39e-1 -19.6    10.5  
#> # … with 294 more rows, 2 more variables: stl_e_acf1 <dbl>, stl_e_acf10 <dbl>,
#> #   and abbreviated variable names ¹​trend_strength, ²​seasonal_strength_year,
#> #   ³​seasonal_peak_year, ⁴​seasonal_trough_year, ⁵​spikiness, ⁶​linearity,
#> #   ⁷​curvature

# Doesn't work: a lambda (formula) passed directly to features()
tourism |> 
  features(Trips, ~feat_stl(., .period = "1 year"))
#> Error in `squash()`:
#> ! Only lists can be spliced

#> Backtrace:
#>     ▆
#>  1. ├─fabletools::features(tourism, Trips, ~feat_stl(., .period = "1 year"))
#>  2. ├─fabletools:::features.tbl_ts(tourism, Trips, ~feat_stl(., .period = "1 year"))
#>  3. │ └─fabletools:::features_impl(.tbl, list(.var), features, ...)
#>  4. │   ├─fabletools:::map(squash(features), as_function)
#>  5. │   │ └─base::lapply(.x, .f, ...)
#>  6. │   └─rlang::squash(features)
#>  7. └─rlang::abort(message = message)

Created on 2022-10-25 with reprex v2.0.2

Aariq, Oct 25 '22 14:10

Thanks for raising this; I think this should be made to work better.

Essentially, features() expects a function or a list of functions. ~feat_stl(., .period = "1 year") is a formula representing a lambda function, and the current method doesn't (yet) know how to handle a bare formula.

Instead, you can wrap it in a list, which is the recommended approach for providing one or more features.

tourism |> 
  features(Trips, list(~feat_stl(., .period = "1 year")))
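The difference between a bare formula and a list containing one can be checked outside of features(). The sketch below uses only base R plus rlang::as_function(); it assumes the `as_function` seen in the backtrace is rlang's, which the issue does not state explicitly. The names `lambda`, `wrapped`, and `f` are purely illustrative.

```r
# A formula written with `~` is a language object, not a function or a list.
lambda <- ~ mean(., na.rm = TRUE)

is.function(lambda)          # FALSE
is.list(lambda)              # FALSE
inherits(lambda, "formula")  # TRUE

# Wrapping it in list() produces the list that features() can splice.
wrapped <- list(lambda)
is.list(wrapped)             # TRUE

# rlang::as_function() converts the formula into a callable function,
# binding `.` to the first argument.
f <- rlang::as_function(lambda)
f(c(1, 2, NA))               # 1.5
```

This is why the "Only lists can be spliced" error disappears once the formula is wrapped in list(): the splicing step receives a list, and each element can then be converted to a function.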

However, this runs into a secondary issue: .period is not (yet) set up to handle the "1 year" syntax.

So ultimately, what works is:

tourism |> 
  features(Trips, list(~feat_stl(., .period = 4)))
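The value 4 follows from tourism being quarterly data, so one year spans four observations. A minimal way to confirm the sampling interval before choosing a numeric .period, assuming the tsibble package is installed (`obs_per_year` is an illustrative name, not part of any package):

```r
library(tsibble)

# `tourism` ships with tsibble and is indexed by quarter.
interval(tourism)    # reports the sampling interval of the index
is_regular(tourism)  # checks that the observations are regularly spaced

# One year of quarterly observations spans 4 periods, hence `.period = 4`.
obs_per_year <- 12 / 3  # months per year / months per quarter
```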

As of 9a1e6fd90d37a9aa5be206e1f18218853af5d445, lambda functions can be passed directly.

f8a8baab1e40f2978958fd02c35f42a43abc0cd2 allows you to use .period = "1 year" in features(), but not directly in the feature functions as you were doing. You can change the .period that is passed to all features by using the ... of features().

So, with these latest changes, I would recommend the following for your example:

tourism |> 
  features(Trips, feat_stl, .period = "1 year")
#> # A tibble: 304 × 12
#>    Region  State Purpose trend…¹ seaso…² seaso…³ seaso…⁴ spiki…⁵ linea…⁶ curva…⁷
#>    <chr>   <chr> <chr>     <dbl>   <dbl>   <dbl>   <dbl>   <dbl>   <dbl>   <dbl>
#>  1 Adelai… Sout… Busine…   0.464   0.407       3       1 1.58e+2  -5.31   71.6  
#>  2 Adelai… Sout… Holiday   0.554   0.619       1       2 9.17e+0  49.0    78.7  
#>  3 Adelai… Sout… Other     0.746   0.202       2       1 2.10e+0  95.1    43.4  
#>  4 Adelai… Sout… Visiti…   0.435   0.452       1       3 5.61e+1  34.6    71.4  
#>  5 Adelai… Sout… Busine…   0.464   0.179       3       0 1.03e-1   0.968  -3.22 
#>  6 Adelai… Sout… Holiday   0.528   0.296       2       1 1.77e-1  10.5    24.0  
#>  7 Adelai… Sout… Other     0.593   0.404       2       2 4.44e-4   4.28    3.19 
#>  8 Adelai… Sout… Visiti…   0.488   0.254       0       3 6.50e+0  34.2    -0.529
#>  9 Alice … Nort… Busine…   0.534   0.251       0       1 1.69e-1  23.8    19.5  
#> 10 Alice … Nort… Holiday   0.381   0.832       3       1 7.39e-1 -19.6    10.5  
#> # … with 294 more rows, 2 more variables: stl_e_acf1 <dbl>, stl_e_acf10 <dbl>,
#> #   and abbreviated variable names ¹​trend_strength, ²​seasonal_strength_4,
#> #   ³​seasonal_peak_4, ⁴​seasonal_trough_4, ⁵​spikiness, ⁶​linearity, ⁷​curvature


tourism |> 
  features(Trips, ~feat_stl(., .period = 4))
#> # A tibble: 304 × 12
#>    Region  State Purpose trend…¹ seaso…² seaso…³ seaso…⁴ spiki…⁵ linea…⁶ curva…⁷
#>    <chr>   <chr> <chr>     <dbl>   <dbl>   <dbl>   <dbl>   <dbl>   <dbl>   <dbl>
#>  1 Adelai… Sout… Busine…   0.464   0.407       3       1 1.58e+2  -5.31   71.6  
#>  2 Adelai… Sout… Holiday   0.554   0.619       1       2 9.17e+0  49.0    78.7  
#>  3 Adelai… Sout… Other     0.746   0.202       2       1 2.10e+0  95.1    43.4  
#>  4 Adelai… Sout… Visiti…   0.435   0.452       1       3 5.61e+1  34.6    71.4  
#>  5 Adelai… Sout… Busine…   0.464   0.179       3       0 1.03e-1   0.968  -3.22 
#>  6 Adelai… Sout… Holiday   0.528   0.296       2       1 1.77e-1  10.5    24.0  
#>  7 Adelai… Sout… Other     0.593   0.404       2       2 4.44e-4   4.28    3.19 
#>  8 Adelai… Sout… Visiti…   0.488   0.254       0       3 6.50e+0  34.2    -0.529
#>  9 Alice … Nort… Busine…   0.534   0.251       0       1 1.69e-1  23.8    19.5  
#> 10 Alice … Nort… Holiday   0.381   0.832       3       1 7.39e-1 -19.6    10.5  
#> # … with 294 more rows, 2 more variables: stl_e_acf1 <dbl>, stl_e_acf10 <dbl>,
#> #   and abbreviated variable names ¹​trend_strength, ²​seasonal_strength_4,
#> #   ³​seasonal_peak_4, ⁴​seasonal_trough_4, ⁵​spikiness, ⁶​linearity, ⁷​curvature

Created on 2022-10-27 by the reprex package (v2.0.1)
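Because .period supplied through the ... of features() is passed to all features, the same argument can serve several feature functions at once. The following is a sketch only: it assumes a fabletools version that includes the two commits referenced above, and it pairs feat_stl with feat_acf (another feasts feature that takes a .period argument) purely for illustration. The name `res` is hypothetical.

```r
library(feasts)
library(tsibble)

# `.period` is given once, via `...`, and forwarded to each feature function.
res <- tourism |>
  features(Trips, list(feat_stl, feat_acf), .period = "1 year")

# One row per key combination (Region, State, Purpose).
nrow(res)
```

Passing the argument through ... keeps the feature list free of lambdas; a lambda is still the better fit when only one of several features should receive a different setting.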

mitchelloharawild, Oct 27 '22 12:10

Thanks for the quick updates. Having examples like these in the documentation, perhaps for features() or feat_stl(), would also be helpful.

Aariq, Oct 27 '22 14:10