timetk
tk_ts automatic start date detection
Hi Mat, first of all I would like to start this post by thanking you for this amazing package!
The idea is that tk_ts would automatically select the start date when you convert a tbl to a ts object.
Let me show you a reproducible example:
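library(tidyverse)  # tibble(), group_by(), nest(), mutate(), map()
library(lubridate)  # year(), month()
library(timetk)     # tk_ts()

# Create dates
d1 <- seq.Date(as.Date("2016-01-01"), length.out = 3, by = "months")
d2 <- seq.Date(as.Date("2017-01-01"), length.out = 3, by = "months")
# Data frame (tibble() replaces the deprecated data_frame())
df <- tibble(
  id = c(1, 1, 1, 2, 2, 2),
  date = c(d1, d2),
  value = c(10, 20, 30, 10, 20, 30)
)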
# A tibble: 6 x 3
     id date       value
  <dbl> <date>     <dbl>
1     1 2016-01-01    10
2     1 2016-02-01    20
3     1 2016-03-01    30
4     2 2017-01-01    10
5     2 2017-02-01    20
6     2 2017-03-01    30
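Each series starts at a different date, so tk_ts cannot be used as is: with no start supplied, both series come back starting at period 1 and the year is lost.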
a <- df %>%
  group_by(id) %>%
  nest() %>%
  mutate(data_ts = map(data, tk_ts, frequency = 12))
> a$data_ts
[[1]]
  Jan Feb Mar
1  10  20  30

[[2]]
  Jan Feb Mar
1  10  20  30
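The row label 1 in both outputs is the default start index. A minimal base R illustration of the same effect, assuming tk_ts hands its start and frequency arguments on to stats::ts() (the ts() behaviour shown here is standard; the forwarding is the assumption):

```r
# Without `start`, ts() begins at time 1, so the calendar year is lost.
x <- ts(c(10, 20, 30), frequency = 12)
start(x)

# Supplying start = c(year, month) keeps the calendar position.
y <- ts(c(10, 20, 30), start = c(2017, 1), frequency = 12)
start(y)
```

Under that assumption, passing start = c(2017, 1) to tk_ts for the second group should give the same labelling as the workaround in this post.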
So I have developed a workaround that helps me with this step, and it may be useful to incorporate it into the tk_ts function.
# Helper function. Assumes that rows are ordered by date.
# The c(year, month) start only fits monthly data (freq = 12).
to_ts <- function(data, freq = 12L) {
  dates <- data$date
  value <- data$value
  ts(value,
     start = c(lubridate::year(dates[1]), lubridate::month(dates[1])),
     frequency = freq)
}
b <- df %>%
  group_by(id) %>%
  nest() %>%
  mutate(data_ts = map(data, to_ts))
> b$data_ts
[[1]]
     Jan Feb Mar
2016  10  20  30

[[2]]
     Jan Feb Mar
2017  10  20  30
Hope that helps! Cheers
Thanks for the kind words!!
I've thought about this as well, but it becomes a pretty difficult problem when dealing with all the different types of time-based data. It's a good idea, and maybe we can use your logic as a starting point for an auto-detect start/freq approach.
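To make the auto-detect idea concrete, here is a rough base R sketch. The helper name guess_ts_spec and the day-gap thresholds are hypothetical and not part of timetk; it only recognises monthly, quarterly and yearly spacing and returns NULL for anything else.

```r
# Hypothetical helper, not a timetk function: guess ts() start/frequency
# from a vector of dates using the typical gap between observations.
guess_ts_spec <- function(dates) {
  dates <- sort(as.Date(dates))
  stopifnot(length(dates) >= 2)
  first <- as.POSIXlt(dates[1])
  year  <- first$year + 1900   # POSIXlt years count from 1900
  month <- first$mon + 1       # POSIXlt months are 0-based
  gap   <- median(as.numeric(diff(dates)))  # typical spacing in days
  if (gap >= 28 && gap <= 31) {
    list(start = c(year, month), frequency = 12)
  } else if (gap >= 89 && gap <= 92) {
    list(start = c(year, (month - 1) %/% 3 + 1), frequency = 4)
  } else if (gap >= 365 && gap <= 366) {
    list(start = year, frequency = 1)
  } else {
    NULL  # spacing not recognised: the caller has to decide
  }
}

d2   <- seq.Date(as.Date("2017-01-01"), length.out = 3, by = "months")
spec <- guess_ts_spec(d2)
ts(c(10, 20, 30), start = spec$start, frequency = spec$frequency)
```

Daily, weekly and sub-daily data are deliberately left out, since they are where the start/frequency mapping stops being obvious.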
I think it makes a lot of sense and also follows the principle of least surprise. I had some tibbles starting mid-year (June or July), and after forecasting and converting back, the series started from January, which was very unexpected for me.
I'm not sure what a good fallback strategy would be in case auto detection is impossible or ambiguous.
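One possible fallback, sketched in base R under stated assumptions: explicit arguments always win, clean monthly spacing is detected, and anything else falls back to the ts() defaults with a warning rather than a silent guess. The name ts_or_fallback is hypothetical and this is not how tk_ts behaves today.

```r
# Hypothetical wrapper, not a timetk function.
# Assumes `values` is already ordered to match the sorted dates.
ts_or_fallback <- function(values, dates, start = NULL, frequency = NULL) {
  # Detect only when the caller did not supply both arguments.
  if (is.null(start) || is.null(frequency)) {
    dates <- sort(as.Date(dates))
    gaps  <- as.numeric(diff(dates))  # spacing in days
    if (length(gaps) > 0 && all(gaps >= 28 & gaps <= 31)) {
      first     <- as.POSIXlt(dates[1])
      start     <- c(first$year + 1900, first$mon + 1)
      frequency <- 12
    } else {
      warning("Could not detect a regular monthly series; ",
              "using start = 1, frequency = 1.")
      start     <- 1
      frequency <- 1
    }
  }
  ts(values, start = start, frequency = frequency)
}

# A series starting mid-year keeps its June start.
mid <- ts_or_fallback(c(1, 2, 3),
                      as.Date(c("2017-06-01", "2017-07-01", "2017-08-01")))
start(mid)

# Irregular dates trigger the warning and the plain ts() defaults.
irregular <- suppressWarnings(
  ts_or_fallback(c(1, 2, 3),
                 as.Date(c("2017-06-01", "2017-06-03", "2017-09-01")))
)
frequency(irregular)
```

Warning instead of failing keeps existing pipelines running, while still telling the user that the index no longer reflects calendar dates.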