tk_ts automatic start date detection

Open ghost opened this issue 8 years ago • 2 comments

Hi Mat, first of all I would like to start this post by thanking you for this amazing package!

The idea is that tk_ts would automatically select the start date when you convert a tbl to a ts object.

Let me show you a reproducible example:

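# Packages used in this example
library(dplyr)      # %>%, group_by, mutate, tibble
library(tidyr)      # nest
library(purrr)      # map
library(lubridate)  # year, month
library(timetk)     # tk_ts

# Create dates
d1 <- seq.Date(as.Date("2016-01-01"), length.out = 3, by = "months")
d2 <- seq.Date(as.Date("2017-01-01"), length.out = 3, by = "months")

# Data frame (tibble() replaces the deprecated data_frame())
df <- tibble(
  id = c(1, 1, 1, 2, 2, 2),
  date = c(d1, d2),
  value = c(10, 20, 30, 10, 20, 30)
)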

# A tibble: 6 x 3
     id       date value
  <dbl>     <date> <dbl>
1     1 2016-01-01    10
2     1 2016-02-01    20
3     1 2016-03-01    30
4     2 2017-01-01    10
5     2 2017-02-01    20
6     2 2017-03-01    30

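Each series starts on a different date, so tk_ts cannot be used as-is: without an explicit start, both series come out indexed from period 1 and the year is lost.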

a <- df %>% 
  group_by(id) %>% 
  nest() %>%
  mutate(data_ts = map(data, tk_ts, frequency = 12))

> a$data_ts
[[1]]
  Jan Feb Mar
1  10  20  30

[[2]]
  Jan Feb Mar
1  10  20  30

So I have developed a workaround that helps me with this step; it may be useful to incorporate it into the tk_ts function.

# Helper function. Assumes dates are sorted in ascending order;
# year() and month() come from lubridate.
to_ts <- function(data, freq = 12L) {
  dates <- data$date
  value <- data$value
  ts(value, start = c(year(dates)[1], month(dates)[1]), frequency = freq)
}

b <- df %>% 
  group_by(id) %>% 
  nest() %>%
  mutate(data_ts = map(data, to_ts))

> b$data_ts
[[1]]
     Jan Feb Mar
2016  10  20  30

[[2]]
     Jan Feb Mar
2017  10  20  30
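
The same idea can be sketched in base R and extended to quarterly data. This is only an illustration under stated assumptions: `start_from_date` and `to_ts_auto` are hypothetical names (not part of timetk), the data has `date` and `value` columns as above, and only frequencies 12 and 4 are handled.

```r
# Hypothetical helper (not part of timetk): derive a ts() start vector
# from the first date of a series. Only monthly (12) and quarterly (4)
# frequencies are handled in this sketch.
start_from_date <- function(date, frequency = 12L) {
  lt  <- as.POSIXlt(date)
  yr  <- lt$year + 1900L   # POSIXlt years count from 1900
  mon <- lt$mon + 1L       # POSIXlt months are 0-based
  period <- switch(
    as.character(frequency),
    "12" = mon,
    "4"  = (mon - 1L) %/% 3L + 1L,
    stop("unsupported frequency: ", frequency)
  )
  c(yr, period)
}

# Hypothetical wrapper: sort by date, then build the ts object with the
# detected start instead of the default start = 1.
to_ts_auto <- function(data, frequency = 12L) {
  data <- data[order(data$date), ]
  ts(data$value,
     start = start_from_date(data$date[1], frequency),
     frequency = frequency)
}

# Illustrative data: a monthly series starting mid-year
d <- data.frame(
  date  = seq(as.Date("2017-06-01"), length.out = 3, by = "month"),
  value = c(10, 20, 30)
)
x <- to_ts_auto(d)
start(x)
```

It can be dropped into the pipeline above in place of to_ts, e.g. `mutate(data_ts = map(data, to_ts_auto))`. Stopping on unsupported frequencies is a deliberate choice here, so that an undetectable start is never guessed silently.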

Hope that helps! Cheers

ghost avatar Oct 25 '17 13:10 ghost

Thanks for the kind words!!

I've thought about this as well, but it becomes a pretty difficult problem when dealing with all the different types of time-based data. It's a good idea, and maybe we can use your logic as a starting point for an auto-detect start/freq approach.

mdancho84 avatar Oct 25 '17 14:10 mdancho84

I think it makes a lot of sense and also follows the principle of least surprise. I had some tibbles starting mid-year (June or July), and after forecasting and converting back, the series started from January, which was very unexpected for me.
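
The surprise described above can be reproduced in base R. This is a minimal sketch with made-up values and an assumed year (2018). When no start is given, `ts()` defaults to `start = 1`, so the first observation lands in period 1 (January) regardless of the original dates.

```r
values <- c(10, 20, 30)  # illustrative data, meant to be June-August 2018

# No start given: ts() defaults to start = 1, so the series begins
# at period 1 of "year" 1 and the months read Jan, Feb, Mar.
no_start <- ts(values, frequency = 12)
start(no_start)
month.abb[as.integer(cycle(no_start))]

# Explicit start: the series is anchored at June 2018.
with_start <- ts(values, start = c(2018, 6), frequency = 12)
start(with_start)
month.abb[as.integer(cycle(with_start))]
```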

I'm not sure what a good fallback strategy would be in case auto detection is impossible or ambiguous.

Fuco1 avatar Jan 03 '19 15:01 Fuco1