fabletools
Mutate() issue - Hierarchical Forecasting
Hi, I am reposting this issue on GitHub, with a more complete example, as I suspect it might not be related to the data being used or code mistakes.
I am trying to perform hierarchical forecasting on a dataset that is fundamentally structured in the same way as the tourism tsibble referenced in Forecasting: Principles and Practice, but with more hierarchical levels. However, after the structural aggregation, a mutate() error shows up.
The data doesn't contain any missing values.
Below you will find a reprex of the code, containing a minimal version of the data that reproduces the error.
Thanks in advance.
library(fable)
library(dplyr)
library(tsibble)
library(tidyverse)
t_london <- tibble::tribble(
~Month, ~Value.type, ~LSOA11CD, ~LSOA11NM, ~WD19CD, ~WD19NM, ~LAD19CD, ~LAD19NM, ~CTYNM, ~RGN19NM, ~CNTY21NM, ~NTN21NM, ~Count,
"2010 Dec", "Value-Type-1 ", "E01000001", "City of London 001A", "E05009288", "Aldersgate", "E09000001", "City of London", "City Of London", "London", "England", "UK", 2L,
"2011 Jan", "Value-Type-1 ", "E01000001", "City of London 001A", "E05009288", "Aldersgate", "E09000001", "City of London", "City Of London", "London", "England", "UK", 2L,
"2011 Feb", "Value-Type-1 ", "E01000001", "City of London 001A", "E05009288", "Aldersgate", "E09000001", "City of London", "City Of London", "London", "England", "UK", 3L,
"2011 Mar", "Value-Type-1 ", "E01000001", "City of London 001A", "E05009288", "Aldersgate", "E09000001", "City of London", "City Of London", "London", "England", "UK", 2L,
"2011 Apr", "Value-Type-1 ", "E01000001", "City of London 001A", "E05009288", "Aldersgate", "E09000001", "City of London", "City Of London", "London", "England", "UK", 0L,
"2011 May", "Value-Type-1 ", "E01000001", "City of London 001A", "E05009288", "Aldersgate", "E09000001", "City of London", "City Of London", "London", "England", "UK", 2L,
"2011 Jun", "Value-Type-1 ", "E01000001", "City of London 001A", "E05009288", "Aldersgate", "E09000001", "City of London", "City Of London", "London", "England", "UK", 4L,
"2011 Jul", "Value-Type-1 ", "E01000001", "City of London 001A", "E05009288", "Aldersgate", "E09000001", "City of London", "City Of London", "London", "England", "UK", 3L,
"2011 Aug", "Value-Type-1 ", "E01000001", "City of London 001A", "E05009288", "Aldersgate", "E09000001", "City of London", "City Of London", "London", "England", "UK", 2L,
"2011 Sep", "Value-Type-1 ", "E01000001", "City of London 001A", "E05009288", "Aldersgate", "E09000001", "City of London", "City Of London", "London", "England", "UK", 0L,
"2011 Oct", "Value-Type-1 ", "E01000001", "City of London 001A", "E05009288", "Aldersgate", "E09000001", "City of London", "City Of London", "London", "England", "UK", 1L,
"2011 Nov", "Value-Type-1 ", "E01000001", "City of London 001A", "E05009288", "Aldersgate", "E09000001", "City of London", "City Of London", "London", "England", "UK", 1L,
"2011 Dec", "Value-Type-1 ", "E01000001", "City of London 001A", "E05009288", "Aldersgate", "E09000001", "City of London", "City Of London", "London", "England", "UK", 6L
)
t_london <- t_london %>%
  mutate(Month = yearmonth(Month)) %>%
  as_tsibble(key = c(LSOA11CD, Value.type), index = Month)
london_full <- t_london %>%
  aggregate_key(
    (NTN21NM / CNTY21NM / RGN19NM / CTYNM / LAD19NM / WD19NM / LSOA11NM) * Value.type,
    Total = sum(Count)
  )
fit <- london_full %>%
  model(base = ARIMA(Total)) %>%
  reconcile(
    bu = bottom_up(base),
    ols = min_trace(base, method = "ols"),
    mint = min_trace(base, method = "mint_shrink")
  )
#> Warning in max(which(abs(ma) > 1e-08)): no non-missing arguments to max;
#> returning -Inf
#> Warning: 16 errors (1 unique) encountered for base
#> [16] argument must be coercible to non-negative integer
fc <- fit %>%
  forecast(h = 5)
#> Warning: Problem with `mutate()` input `mint`.
#> ℹ diag(.) had 0 or NA entries; non-finite result is doubtful
#> ℹ Input `mint` is `(function (object, ...) ...`.
#> Warning: Problem with `mutate()` input `mint`.
#> ℹ diag(.) had 0 or NA entries; non-finite result is doubtful
#> ℹ Input `mint` is `(function (object, ...) ...`.
#> Error: Problem with `mutate()` input `mint`.
#> x infinite or missing values in 'x'
#> ℹ Input `mint` is `(function (object, ...) ...`.
Created on 2021-02-10 by the reprex package (v0.3.0)
I am unable to reproduce this issue with the latest versions of the packages. Perhaps try updating to the latest CRAN releases?
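Before updating, it can help to see which versions are currently installed. The snippet below is a minimal sketch using only base R; the package list in `pkgs` is an assumption based on the libraries loaded in the reprex, and packages that are not installed are reported as `NA` rather than raising an error.

```r
# Packages loaded in the reprex (assumed list; adjust as needed)
pkgs <- c("fable", "fabletools", "tsibble", "dplyr")

# Names of all packages available in the current library paths
installed <- rownames(installed.packages())

# Installed version of each package as a string, or NA if it is missing
versions <- vapply(
  pkgs,
  function(p) {
    if (p %in% installed) as.character(packageVersion(p)) else NA_character_
  },
  character(1)
)

print(versions)

# To pull the latest CRAN releases, uncomment the next line:
# install.packages(pkgs)
```

Comparing this output with the session info from a working run shows quickly whether an outdated package is a plausible cause.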
library(fable)
#> Loading required package: fabletools
library(dplyr)
#>
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#>
#> filter, lag
#> The following objects are masked from 'package:base':
#>
#> intersect, setdiff, setequal, union
library(tsibble)
#>
#> Attaching package: 'tsibble'
#> The following objects are masked from 'package:base':
#>
#> intersect, setdiff, union
library(tidyverse)
t_london <- tibble::tribble(
~Month, ~Value.type, ~LSOA11CD, ~LSOA11NM, ~WD19CD, ~WD19NM, ~LAD19CD, ~LAD19NM, ~CTYNM, ~RGN19NM, ~CNTY21NM, ~NTN21NM, ~Count,
"2010 Dec", "Value-Type-1 ", "E01000001", "City of London 001A", "E05009288", "Aldersgate", "E09000001", "City of London", "City Of London", "London", "England", "UK", 2L,
"2011 Jan", "Value-Type-1 ", "E01000001", "City of London 001A", "E05009288", "Aldersgate", "E09000001", "City of London", "City Of London", "London", "England", "UK", 2L,
"2011 Feb", "Value-Type-1 ", "E01000001", "City of London 001A", "E05009288", "Aldersgate", "E09000001", "City of London", "City Of London", "London", "England", "UK", 3L,
"2011 Mar", "Value-Type-1 ", "E01000001", "City of London 001A", "E05009288", "Aldersgate", "E09000001", "City of London", "City Of London", "London", "England", "UK", 2L,
"2011 Apr", "Value-Type-1 ", "E01000001", "City of London 001A", "E05009288", "Aldersgate", "E09000001", "City of London", "City Of London", "London", "England", "UK", 0L,
"2011 May", "Value-Type-1 ", "E01000001", "City of London 001A", "E05009288", "Aldersgate", "E09000001", "City of London", "City Of London", "London", "England", "UK", 2L,
"2011 Jun", "Value-Type-1 ", "E01000001", "City of London 001A", "E05009288", "Aldersgate", "E09000001", "City of London", "City Of London", "London", "England", "UK", 4L,
"2011 Jul", "Value-Type-1 ", "E01000001", "City of London 001A", "E05009288", "Aldersgate", "E09000001", "City of London", "City Of London", "London", "England", "UK", 3L,
"2011 Aug", "Value-Type-1 ", "E01000001", "City of London 001A", "E05009288", "Aldersgate", "E09000001", "City of London", "City Of London", "London", "England", "UK", 2L,
"2011 Sep", "Value-Type-1 ", "E01000001", "City of London 001A", "E05009288", "Aldersgate", "E09000001", "City of London", "City Of London", "London", "England", "UK", 0L,
"2011 Oct", "Value-Type-1 ", "E01000001", "City of London 001A", "E05009288", "Aldersgate", "E09000001", "City of London", "City Of London", "London", "England", "UK", 1L,
"2011 Nov", "Value-Type-1 ", "E01000001", "City of London 001A", "E05009288", "Aldersgate", "E09000001", "City of London", "City Of London", "London", "England", "UK", 1L,
"2011 Dec", "Value-Type-1 ", "E01000001", "City of London 001A", "E05009288", "Aldersgate", "E09000001", "City of London", "City Of London", "London", "England", "UK", 6L
)
t_london <- t_london %>%
  mutate(Month = yearmonth(Month)) %>%
  as_tsibble(key = c(LSOA11CD, Value.type), index = Month)
london_full <- t_london %>%
  aggregate_key(
    (NTN21NM / CNTY21NM / RGN19NM / CTYNM / LAD19NM / WD19NM / LSOA11NM) * Value.type,
    Total = sum(Count)
  )
fit <- london_full %>%
  model(base = ARIMA(Total)) %>%
  reconcile(
    bu = bottom_up(base),
    ols = min_trace(base, method = "ols"),
    mint = min_trace(base, method = "mint_shrink")
  )
fc <- fit %>%
  forecast(h = 5)
fc
#> # A fable: 320 x 12 [1M]
#> # Key: NTN21NM, Value.type, CNTY21NM, RGN19NM, CTYNM, LAD19NM, WD19NM,
#> # LSOA11NM, .model [64]
#> NTN21NM Value.type CNTY21NM RGN19NM CTYNM LAD19NM WD19NM
#> <chr*> <chr*> <chr*> <chr*> <chr*> <chr*> <chr*>
#> 1 UK Value-Typ… England London City Of L… City of L… Aldersgate
#> 2 UK Value-Typ… England London City Of L… City of L… Aldersgate
#> 3 UK Value-Typ… England London City Of L… City of L… Aldersgate
#> 4 UK Value-Typ… England London City Of L… City of L… Aldersgate
#> 5 UK Value-Typ… England London City Of L… City of L… Aldersgate
#> 6 UK Value-Typ… England London City Of L… City of L… Aldersgate
#> 7 UK Value-Typ… England London City Of L… City of L… Aldersgate
#> 8 UK Value-Typ… England London City Of L… City of L… Aldersgate
#> 9 UK Value-Typ… England London City Of L… City of L… Aldersgate
#> 10 UK Value-Typ… England London City Of L… City of L… Aldersgate
#> # … with 310 more rows, and 5 more variables: LSOA11NM <chr*>, .model <chr>,
#> # Month <mth>, Total <dist>, .mean <dbl>
Created on 2021-02-11 by the reprex package (v0.3.0)
Session info
devtools::session_info()
#> ─ Session info ───────────────────────────────────────────────────────────────
#> setting value
#> version R version 4.0.2 (2020-06-22)
#> os Ubuntu 20.04.1 LTS
#> system x86_64, linux-gnu
#> ui X11
#> language en_AU:en
#> collate en_AU.UTF-8
#> ctype en_AU.UTF-8
#> tz Australia/Melbourne
#> date 2021-02-11
#>
#> ─ Packages ───────────────────────────────────────────────────────────────────
#> package * version date lib source
#> anytime 0.3.9 2020-08-27 [1] CRAN (R 4.0.2)
#> assertthat 0.2.1 2019-03-21 [1] CRAN (R 4.0.2)
#> backports 1.2.1 2020-12-09 [1] CRAN (R 4.0.2)
#> blob 1.2.1 2020-01-20 [1] CRAN (R 4.0.2)
#> broom 0.7.0 2020-07-09 [1] CRAN (R 4.0.2)
#> callr 3.5.1 2020-10-13 [1] CRAN (R 4.0.2)
#> cellranger 1.1.0 2016-07-27 [1] CRAN (R 4.0.2)
#> cli 2.3.0 2021-01-31 [1] CRAN (R 4.0.2)
#> colorspace 2.0-0 2020-11-11 [1] CRAN (R 4.0.2)
#> crayon 1.4.0 2021-01-30 [1] CRAN (R 4.0.2)
#> DBI 1.1.0 2019-12-15 [1] CRAN (R 4.0.2)
#> dbplyr 1.4.4 2020-05-27 [1] CRAN (R 4.0.2)
#> desc 1.2.0 2018-05-01 [1] CRAN (R 4.0.2)
#> devtools 2.3.2 2020-09-18 [1] CRAN (R 4.0.2)
#> digest 0.6.27 2020-10-24 [1] CRAN (R 4.0.2)
#> distributional 0.2.1 2020-10-06 [1] CRAN (R 4.0.2)
#> dplyr * 1.0.4 2021-02-02 [1] CRAN (R 4.0.2)
#> ellipsis 0.3.1 2020-05-15 [1] CRAN (R 4.0.2)
#> evaluate 0.14 2019-05-28 [1] CRAN (R 4.0.2)
#> fable * 0.3.0 2021-02-02 [1] local
#> fabletools * 0.3.0.9000 2021-02-02 [1] local
#> fansi 0.4.2 2021-01-15 [1] CRAN (R 4.0.2)
#> farver 2.0.3 2020-01-16 [1] CRAN (R 4.0.2)
#> feasts 0.1.7 2021-02-08 [1] local
#> forcats * 0.5.1 2021-01-27 [1] CRAN (R 4.0.2)
#> fs 1.5.0 2020-07-31 [1] CRAN (R 4.0.2)
#> generics 0.1.0 2020-10-31 [1] CRAN (R 4.0.2)
#> ggplot2 * 3.3.3 2020-12-30 [1] CRAN (R 4.0.2)
#> glue 1.4.2 2020-08-27 [1] CRAN (R 4.0.2)
#> gtable 0.3.0 2019-03-25 [1] CRAN (R 4.0.2)
#> haven 2.3.1 2020-06-01 [1] CRAN (R 4.0.2)
#> highr 0.8 2019-03-20 [1] CRAN (R 4.0.2)
#> hms 1.0.0 2021-01-13 [1] CRAN (R 4.0.2)
#> htmltools 0.5.1 2021-01-12 [1] CRAN (R 4.0.2)
#> httr 1.4.2 2020-07-20 [1] CRAN (R 4.0.2)
#> jsonlite 1.7.2 2020-12-09 [1] CRAN (R 4.0.2)
#> knitr 1.30 2020-09-22 [1] CRAN (R 4.0.2)
#> lattice 0.20-41 2020-04-02 [2] CRAN (R 4.0.2)
#> lifecycle 0.2.0 2020-03-06 [1] CRAN (R 4.0.2)
#> lubridate 1.7.9.2 2020-11-13 [1] CRAN (R 4.0.2)
#> magrittr 2.0.1 2020-11-17 [1] CRAN (R 4.0.2)
#> Matrix 1.2-18 2019-11-27 [2] CRAN (R 4.0.2)
#> memoise 1.1.0 2017-04-21 [1] CRAN (R 4.0.2)
#> modelr 0.1.8 2020-05-19 [1] CRAN (R 4.0.2)
#> munsell 0.5.0 2018-06-12 [1] CRAN (R 4.0.2)
#> nlme 3.1-148 2020-05-24 [2] CRAN (R 4.0.2)
#> pillar 1.4.7 2020-11-20 [1] CRAN (R 4.0.2)
#> pkgbuild 1.2.0 2020-12-15 [1] CRAN (R 4.0.2)
#> pkgconfig 2.0.3 2019-09-22 [1] CRAN (R 4.0.2)
#> pkgload 1.1.0 2020-05-29 [1] CRAN (R 4.0.2)
#> prettyunits 1.1.1 2020-01-24 [1] CRAN (R 4.0.2)
#> processx 3.4.5 2020-11-30 [1] CRAN (R 4.0.2)
#> progressr 0.7.0 2020-12-11 [1] CRAN (R 4.0.2)
#> ps 1.5.0 2020-12-05 [1] CRAN (R 4.0.2)
#> purrr * 0.3.4 2020-04-17 [1] CRAN (R 4.0.2)
#> R6 2.5.0 2020-10-28 [1] CRAN (R 4.0.2)
#> Rcpp 1.0.6 2021-01-15 [1] CRAN (R 4.0.2)
#> readr * 1.4.0 2020-10-05 [1] CRAN (R 4.0.2)
#> readxl 1.3.1 2019-03-13 [1] CRAN (R 4.0.2)
#> remotes 2.2.0 2020-07-21 [1] CRAN (R 4.0.2)
#> reprex 0.3.0 2019-05-16 [1] CRAN (R 4.0.2)
#> rlang 0.4.10 2020-12-30 [1] CRAN (R 4.0.2)
#> rmarkdown 2.6 2020-12-14 [1] CRAN (R 4.0.2)
#> rprojroot 2.0.2 2020-11-15 [1] CRAN (R 4.0.2)
#> rvest 0.3.6 2020-07-25 [1] CRAN (R 4.0.2)
#> scales 1.1.1 2020-05-11 [1] CRAN (R 4.0.2)
#> sessioninfo 1.1.1 2018-11-05 [1] CRAN (R 4.0.2)
#> stringi 1.5.3 2020-09-09 [1] CRAN (R 4.0.2)
#> stringr * 1.4.0 2019-02-10 [1] CRAN (R 4.0.2)
#> testthat 3.0.1 2020-12-17 [1] CRAN (R 4.0.2)
#> tibble * 3.0.6 2021-01-29 [1] CRAN (R 4.0.2)
#> tidyr * 1.1.2 2020-08-27 [1] CRAN (R 4.0.2)
#> tidyselect 1.1.0 2020-05-11 [1] CRAN (R 4.0.2)
#> tidyverse * 1.3.0 2019-11-21 [1] CRAN (R 4.0.2)
#> tsibble * 1.0.0 2021-02-05 [1] Github (tidyverts/tsibble@722cc86)
#> urca 1.3-0 2016-09-06 [1] CRAN (R 4.0.2)
#> usethis 1.6.3 2020-09-17 [1] CRAN (R 4.0.2)
#> utf8 1.1.4 2018-05-24 [1] CRAN (R 4.0.2)
#> vctrs 0.3.6 2020-12-17 [1] CRAN (R 4.0.2)
#> withr 2.4.1 2021-01-26 [1] CRAN (R 4.0.2)
#> xfun 0.20 2021-01-06 [1] CRAN (R 4.0.2)
#> xml2 1.3.2 2020-04-23 [1] CRAN (R 4.0.2)
#> yaml 2.2.1 2020-02-01 [1] CRAN (R 4.0.2)
#>
#> [1] /home/mitchell/R/x86_64-pc-linux-gnu-library/4.0
#> [2] /opt/R/4.0.0/lib/R/library
Great, the issue is solved with the latest version of the packages. Thank you very much!
The problem seems to have reappeared when using the whole dataset.
I tried to capture some of the rows that seem to be part of the issue, which you will find in the new reprex. All the packages used are at their latest versions.
library(fable)
#> Loading required package: fabletools
library(dplyr)
#>
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#>
#> filter, lag
#> The following objects are masked from 'package:base':
#>
#> intersect, setdiff, setequal, union
library(tsibble)
library(tidyverse)
t_london <- tibble::tribble(
~Month, ~Value.type, ~LSOA11NM, ~WD19CD, ~WD19NM, ~LAD19NM, ~CTYNM, ~RGN19NM, ~CNTY21NM, ~NTN21NM, ~Count,
"2016 Dec", "Value-Type2", "City of London 001A", "E05009288", "Aldersgate", "City of London", "City Of London", "London", "England", "UK", 0L,
"2017 Jan", "Value-Type2", "City of London 001A", "E05009288", "Aldersgate", "City of London", "City Of London", "London", "England", "UK", 0L,
"2016 Dec", "Value-Type2", "City of London 001B", "E05009302", "Cripplegate", "City of London", "City Of London", "London", "England", "UK", 1L,
"2017 Jan", "Value-Type2", "City of London 001B", "E05009302", "Cripplegate", "City of London", "City Of London", "London", "England", "UK", 1L,
"2016 Dec", "Value-Type2", "City of London 001C", "E05009302", "Cripplegate", "City of London", "City Of London", "London", "England", "UK", 0L,
"2017 Jan", "Value-Type2", "City of London 001C", "E05009302", "Cripplegate", "City of London", "City Of London", "London", "England", "UK", 0L,
"2016 Dec", "Value-Type2", "City of London 001E", "E05009308", "Portsoken", "City of London", "City Of London", "London", "England", "UK", 0L,
"2017 Jan", "Value-Type2", "City of London 001E", "E05009308", "Portsoken", "City of London", "City Of London", "London", "England", "UK", 1L,
"2016 Dec", "Value-Type2", "City of London 001F", "E05009311", "Vintry", "City of London", "City Of London", "London", "England", "UK", 54L,
"2017 Jan", "Value-Type2", "City of London 001F", "E05009311", "Vintry", "City of London", "City Of London", "London", "England", "UK", 62L,
"2016 Dec", "Value-Type2", "City of London 001G", "E05009304", "Farringdon Within", "City of London", "City Of London", "London", "England", "UK", 12L,
"2017 Jan", "Value-Type2", "City of London 001G", "E05009304", "Farringdon Within", "City of London", "City Of London", "London", "England", "UK", 9L
)
t_london <- t_london %>%
  mutate(Month = yearmonth(Month)) %>%
  as_tsibble(key = c(LSOA11NM, Value.type), index = Month)
london_full <- t_london %>%
  aggregate_key(
    (NTN21NM / CNTY21NM / RGN19NM / CTYNM / LAD19NM / WD19NM / LSOA11NM) * Value.type,
    Total = sum(Count)
  )
fit <- london_full %>%
  model(base = ARIMA(Total)) %>%
  reconcile(
    bu = bottom_up(base),
    ols = min_trace(base, method = "ols"),
    mint = min_trace(base, method = "mint_shrink")
  )
#> Warning: 6 errors (1 unique) encountered for base
#> [6] missing value where TRUE/FALSE needed
fc <- fit %>%
  forecast(h = 1)
#> Warning in cov2cor(covm): diag(.) had 0 or NA entries; non-finite result is
#> doubtful
#> Warning in cov2cor(tar): diag(.) had 0 or NA entries; non-finite result is
#> doubtful
#> Error: Problem with `mutate()` input `mint`.
#> x infinite or missing values in 'x'
#> ℹ Input `mint` is `(function (object, ...) ...`.
Created on 2021-02-12 by the reprex package (v1.0.0)
I get the same issue. Are there any updates on this?
Are there any updates on this, @mitchelloharawild?
Thanks in advance!
Hi, please provide a minimal reproducible example. I've just tried reproducing the example above, and it fails because the ARIMA models are being trained on just 2 observations per series. More data is required to produce sensible output.
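A quick way to catch this before fitting is to count the observations in each series. The snippet below is a minimal sketch on hypothetical toy data (the `toy` tsibble, the `Series` key, and the `min_obs` threshold are all assumptions for illustration, not part of the original reprex); it flags any series that is too short to model.

```r
library(dplyr)
library(tsibble)

# Hypothetical toy data: two series with two monthly observations each,
# mirroring the shape of the failing reprex
toy <- tibble::tribble(
  ~Month,     ~Series, ~Count,
  "2016 Dec", "A",     0L,
  "2017 Jan", "A",     0L,
  "2016 Dec", "B",     1L,
  "2017 Jan", "B",     1L
) %>%
  mutate(Month = yearmonth(Month)) %>%
  as_tsibble(key = Series, index = Month)

# Assumed threshold: two full years of monthly data (not an official rule)
min_obs <- 24

# Number of rows per series; as_tibble() drops the tsibble structure
# so that count() groups on the key column only
obs_per_series <- toy %>%
  as_tibble() %>%
  count(Series, name = "n_obs")

# Series with too few observations to fit a sensible ARIMA model
too_short <- obs_per_series %>%
  filter(n_obs < min_obs)

print(too_short)
```

If `too_short` is non-empty, those series need more history (or a simpler model) before reconciliation methods such as `min_trace()` can be expected to work.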