tsibble
FR: optional .interval argument to `fill_gaps`
I am joining multiple time series values collected on different intervals, ranging from months to years. Consequently, I need to harmonize the intervals to perform the join.
At the moment, I don't see a documented method for setting the desired interval, either directly or when calling `fill_gaps`.
StackOverflow shows a mechanism for overriding the interval by explicitly changing the object attribute (see https://stackoverflow.com/a/75981369), but I prefer to use documented interfaces whenever possible.
For my current code, it would be very helpful to have an additional optional `.interval` argument to `fill_gaps` that performs this step.
Perhaps something like the following two functions:
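# Overwrite the interval attribute of a tsibble; `...` is passed to new_interval().
set_interval <- function(object, ...) {
  attr(object, "interval") <- new_interval(...)
  object
}

# Wrapper around fill_gaps() that optionally overrides the interval first.
fill_gaps_interval <- function(.data, ..., .full = FALSE, .start = NULL,
                               .end = NULL, .interval = NULL) {
  if (!is.null(.interval)) {
    # e.g. .interval = c(quarter = 1) becomes set_interval(quarter = 1, object = .data)
    .interval <- as.list(.interval)
    .interval$object <- .data
    .data <- do.call(set_interval, .interval)
  }
  # Re-issue the original call against tsibble::fill_gaps(), minus .interval.
  call <- match.call()
  call$.data <- .data
  call$.interval <- NULL
  call[[1L]] <- quote(tsibble::fill_gaps)
  eval(call, parent.frame())
}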
Reproducible example:
> library(tidyverse)
> library(tsibble)
> df1 <- tsibble(quarter = yearquarter(as_date(c('2020-1-1','2021-1-1','2022-3-1'))),
+ amount = c(5, 2, 1))
Using `quarter` as index variable.
> df2 <- tsibble(quarter = yearquarter(as_date(c('2022-1-1','2022-4-1','2022-7-1'))),
+ amount = c(5, 2, 1))
Using `quarter` as index variable.
> ###
> # Existing functionality
> ###
>
> interval(df1)
<interval[1]>
[1] 4Q
> # --> Fills 4Q interval
> df1 %>% fill_gaps(.start=yearquarter('2020-01-01'), .end=yearquarter('2023-01-01'))
# A tsibble: 4 x 2 [4Q]
  quarter amount
    <qtr>  <dbl>
1 2020 Q1      5
2 2021 Q1      2
3 2022 Q1      1
4 2023 Q1     NA
> # --> Fills 1Q interval
> interval(df2)
<interval[1]>
[1] 1Q
> df2 %>% fill_gaps(.start=yearquarter('2020-01-01'), .end=yearquarter('2023-01-01'))
# A tsibble: 13 x 2 [1Q]
   quarter amount
     <qtr>  <dbl>
 1 2020 Q1     NA
 2 2020 Q2     NA
 3 2020 Q3     NA
 4 2020 Q4     NA
 5 2021 Q1     NA
 6 2021 Q2     NA
 7 2021 Q3     NA
 8 2021 Q4     NA
 9 2022 Q1      5
10 2022 Q2      2
11 2022 Q3      1
12 2022 Q4     NA
13 2023 Q1     NA
> ###
> # Desired functionality: Fill to individual quarter
> ##
> df1 %>% fill_gaps_interval(.start=yearquarter('2020-01-01'), .end=yearquarter('2023-01-01'), .interval=c(quarter=1))
# A tsibble: 13 x 2 [1Q]
   quarter amount
     <qtr>  <dbl>
 1 2020 Q1      5
 2 2020 Q2     NA
 3 2020 Q3     NA
 4 2020 Q4     NA
 5 2021 Q1      2
 6 2021 Q2     NA
 7 2021 Q3     NA
 8 2021 Q4     NA
 9 2022 Q1      1
10 2022 Q2     NA
11 2022 Q3     NA
12 2022 Q4     NA
13 2023 Q1     NA
> df2 %>% fill_gaps_interval(.start=yearquarter('2020-01-01'), .end=yearquarter('2023-01-01'), .interval=c(quarter=1))
# A tsibble: 13 x 2 [1Q]
   quarter amount
     <qtr>  <dbl>
 1 2020 Q1     NA
 2 2020 Q2     NA
 3 2020 Q3     NA
 4 2020 Q4     NA
 5 2021 Q1     NA
 6 2021 Q2     NA
 7 2021 Q3     NA
 8 2021 Q4     NA
 9 2022 Q1      5
10 2022 Q2      2
11 2022 Q3      1
12 2022 Q4     NA
13 2023 Q1     NA
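For comparison, the quarterly result for df1 can probably also be reached today with documented functions only, by building the full quarterly index by hand and joining the data onto it. This is only a sketch of a workaround, not a substitute for the requested argument: the names `full_index` and `filled` are made up for illustration, and it assumes that `as_tsibble()` infers a 1Q interval once every quarter is present in the index.

```r
library(tsibble)
library(dplyr)

# Same data as df1 above: three observations, one per year.
df1 <- tsibble(
  quarter = yearquarter(as.Date(c("2020-01-01", "2021-01-01", "2022-03-01"))),
  amount  = c(5, 2, 1),
  index   = quarter
)

# Hypothetical helper table: every quarter from 2020 Q1 through 2023 Q1.
full_index <- tibble(
  quarter = yearquarter(
    seq(as.Date("2020-01-01"), as.Date("2023-01-01"), by = "quarter")
  )
)

# Join the observations onto the complete index; quarters without an
# observation get NA. Rebuilding the tsibble lets the interval be
# re-inferred from the now-complete index.
filled <- full_index %>%
  left_join(as_tibble(df1), by = "quarter") %>%
  as_tsibble(index = quarter)

filled
```

This needs noticeably more code than a single `.interval` argument would, and has to be repeated for every series being harmonized, which is the motivation for the request.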