tsibble
allow measurement units via units package?
This reprex illustrates the problem:
library (tsibble)
library (units)
#> udunits system database from /usr/share/udunits
daily <- set_units (1:100, "day")
class (daily)
#> [1] "units"
x <- tsibble (day = daily,
index = daily)
#> Error: Must extract column with a single valid subscript.
#> ✖ Subscript `var` has the wrong type `units`.
#> ℹ It must be numeric or character.
Created on 2020-06-25 by the reprex package (v0.3.0)
With due acknowledgement of your statement in #134 that "with respect to modelling, there's no difference between 1 year and 1 unit", there is nevertheless a difference in internal representations within software. In this case, it may be considered important to retain explicit specifications of measurement units, here via the units package. This is arguably the only way to put an absolute scale on interval data which have no fixed time scale, and that is surely an important thing to be able to do.
Minor MRE fix: to create a tsibble, the index argument should match a column name, not a value:
library (tsibble)
library (units)
#> udunits system database from /usr/share/xml/udunits
daily <- set_units (1:100, "day")
class(daily)
#> [1] "units"
x <- tsibble (day = daily, index = day)
#> Error: Unsupported index type: units
Created on 2020-06-25 by the reprex package (v0.3.0)
Are you looking for relative days as the index instead of absolute dates? I'd suggest using hms::hms() or lubridate::period(), both of which are natively supported by tsibble.
library(tsibble)
daily <- hms::hms(days = 1:100)
tsibble (day = daily, index = day)
#> # A tsibble: 100 x 1 [24h]
#> day
#> <time>
#> 1 24:00
#> 2 48:00
#> 3 72:00
#> 4 96:00
#> 5 120:00
#> 6 144:00
#> 7 168:00
#> 8 192:00
#> 9 216:00
#> 10 240:00
#> # … with 90 more rows
Created on 2020-06-26 by the reprex package (v0.3.0)
tsibble supports commonly used time classes. It's up to the developer of a package to implement support for its custom index classes, and their associated intervals, within that package.
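Until a units index is supported, one possible workaround in the spirit of the hms suggestion is to convert the units vector to a class tsibble already accepts. This is only a minimal sketch, not part of tsibble's API: it assumes the units and hms packages are installed, the names secs and idx are purely illustrative, and the unit metadata is lost in the conversion.

```r
library(tsibble)
library(units)

daily <- set_units(1:100, "day")

# Convert days to seconds, then drop the units class to get a plain numeric vector.
secs <- drop_units(set_units(daily, "s"))

# hms stores durations as seconds, and tsibble accepts it as an index.
idx <- hms::hms(seconds = secs)
x <- tsibble(day = idx, index = day)
```

The trade-off is that the resulting index is an ordinary hms duration, so any downstream code that relies on the units class would need the unit re-attached by hand.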
Thanks @earowang, but the problem is that the straightforward ways to implement intervals do not work:
library (tsibble)
library (units)
#> udunits system database from /usr/share/udunits
daily <- set_units (1:100, "day")
x <- tsibble (day = daily, index = day)
#> Error: Unsupported index type: units
library (lubridate)
daily <- days (1:100)
x <- tsibble (day = daily, index = day)
#> Error in vec_proxy_period(x): trying to get slot "year" from an object (class "Period") that is not an S4 object
Created on 2020-06-26 by the reprex package (v0.3.0)
The only standard units which seem acceptable are absolute ones (lubridate::hms and the like), but no relative units seem to work at all (including none of the lubridate::Period-class ones). The most straightforward way to specify intervals is to use either the units package or these Period-class objects, yet neither of these works. I would also suggest that simple specification of intervals shouldn't be relegated to "custom classes": this is a very general task that I think would find very general use, and code like that above should simply work.
Support for lubridate::period() as an index is very recent, and is currently only available in the development version on GitHub.
library(lubridate)
#>
#> Attaching package: 'lubridate'
#> The following objects are masked from 'package:base':
#>
#> date, intersect, setdiff, union
library(tsibble)
#>
#> Attaching package: 'tsibble'
#> The following object is masked from 'package:lubridate':
#>
#> interval
daily <- days(1:100)
tsibble(day = daily, index = day)
#> # A tsibble: 100 x 1 [1D]
#> day
#> <Period>
#> 1 1d 0H 0M 0S
#> 2 2d 0H 0M 0S
#> 3 3d 0H 0M 0S
#> 4 4d 0H 0M 0S
#> 5 5d 0H 0M 0S
#> 6 6d 0H 0M 0S
#> 7 7d 0H 0M 0S
#> 8 8d 0H 0M 0S
#> 9 9d 0H 0M 0S
#> 10 10d 0H 0M 0S
#> # … with 90 more rows
Created on 2020-06-26 by the reprex package (v0.3.0)
Awesome! Any chance of similar integration of units? It is the R interface to the udunits2 library, so it should be considered the definitive implementation of units in R, including units for time series. Ping @edzer
That would definitely ease the adoption by people from the modelling communities who use udunits2.
Note, however, that units and calendars have a difficult relationship. The udunits documentation cautions: "CAUTION: The timestamp-unit was created to be analogous to, for example, the degree celsius, but for the time dimension. I've come to believe, however, that creating such a unit was a mistake, primarily because users try to use the unit in ways for which it was not designed (such as converting dates in a calendar whose year is exactly 365 days long). Such activities are much better handled by a dedicated calendar package. Please be careful about using timestamp-units." This is illustrated by:
> library(units)
udunits system database from /usr/share/xml/udunits
> set_units(set_units(1, "year"), "days")
365.2422 [d]
R's native time/date classes (Date, POSIXt) don't give difftime objects with units "months" or "years", for that reason.
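The same point can be checked in base R alone. The following is a small sketch using only base functions (the variable names are illustrative): the number of days in a "year" depends on the calendar, and difftime refuses calendar units rather than approximating them.

```r
# The length of a "year" depends on which year it is.
leap_year   <- as.Date("2021-01-01") - as.Date("2020-01-01")
common_year <- as.Date("2020-01-01") - as.Date("2019-01-01")
as.numeric(leap_year, units = "days")    # 366
as.numeric(common_year, units = "days")  # 365

# difftime converts only between fixed-length units
# (secs, mins, hours, days, weeks).
d <- as.difftime(1, units = "weeks")
units(d) <- "days"
as.numeric(d)  # 7

# Requesting a calendar unit raises an error instead of guessing a length.
res <- tryCatch(
  {
    units(d) <- "years"
    d
  },
  error = function(e) e
)
inherits(res, "error")  # TRUE
```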