lubridate icon indicating copy to clipboard operation
lubridate copied to clipboard

floor_date() does not preserve NA on R devel

Open nealrichardson opened this issue 3 years ago • 3 comments

Ran into this very fine edge case in https://github.com/apache/arrow/pull/14282#issuecomment-1264031491. Due to a recent change in R devel (my guess is https://github.com/wch/r-source/commit/4f70ce0d79eeda7464cf97448e515275cbef754b), floor_date() can erroneously return 1970-01-01 for an NA Date input.

# Current R:
> lubridate::floor_date(as.Date(NA), "1 day")
[1] NA

# R at commit 82963
> lubridate::floor_date(as.Date(NA), "1 day")
[1] "1970-01-01"

I dug a bit and saw that it has to do with lubridate:::floor_multi_unit1(): on NA input, it returns NA_real_ instead instead of the integer type that belongs in all of the fields of a POSIXlt object. This used to be fine:

# Current R
> as.Date(structure(list(sec =  0L, min =  0L, hour =  0L, mday = NA_real_, 
+     mon = NA_integer_, year = NA_integer_, wday = NA_integer_, 
+     yday = NA_integer_, isdst = -1L), class = c("POSIXlt", "POSIXt"
+ ), tzone = "UTC"))
[1] NA

But in R devel, it no longer does what is expected:

> as.Date(structure(list(sec =  0L, min =  0L, hour =  0L, mday = NA_real_, 
+     mon = NA_integer_, year = NA_integer_, wday = NA_integer_, 
+     yday = NA_integer_, isdst = -1L), class = c("POSIXlt", "POSIXt"
+ ), tzone = "UTC"))
[1] "1970-01-01"

though if mday were integer and not real, you get the expected NA result:

> as.Date(structure(list(sec =  0L, min =  0L, hour =  0L, mday = NA_integer_, 
+     mon = NA_integer_, year = NA_integer_, wday = NA_integer_, 
+     yday = NA_integer_, isdst = -1L), class = c("POSIXlt", "POSIXt"
+ ), tzone = "UTC"))
[1] NA

Arguably, R should handle this correctly. Happy to report this upstream if you think that's best. Though one could also argue that sticking something that's not an integer in a POSIXlt is breaking the data contract. (That said, I didn't see anywhere in the R docs that said all fields had to be integers, and to support fractional seconds, it would have to accept numeric at least in secs.)

Either way, it seems like this could be easily worked around in lubridate by having floor_multi_unit1() wrap its output in as.integer(). Happy to submit a PR for this if you'd like.

nealrichardson avatar Oct 02 '22 19:10 nealrichardson

This issue has lead {admiral} to fail on R-devel. While I did manage to add a workaround to our code to explicitly handle this case I'd like to see this being fixed here.

thomas-neitmann avatar Oct 05 '22 08:10 thomas-neitmann

slider also fails on r-devel (CRAN pinged me about it with a request that I fix slider, but it is a lubridate issue).

I was able to get r-devel installed so I'll see if I can get a PR together...

Here is another problem (potentially the same issue). The 2nd element here should be NA

> as.Date("2019-01-30") + months(0:3)
[1] "2019-01-30" "1970-01-30" "2019-03-30" "2019-04-30"

DavisVaughan avatar Oct 05 '22 19:10 DavisVaughan

I've tracked down the issue with + months(0:3) and that one looks like a r-devel bug so I posted here: https://stat.ethz.ch/pipermail/r-devel/2022-October/082066.html

DavisVaughan avatar Oct 05 '22 21:10 DavisVaughan

Fixed in R-devel.

vspinu avatar Oct 24 '22 19:10 vspinu