Infinite dates are not parsed correctly

Open keesdeschepper opened this issue 2 years ago • 3 comments

What seems to be new in R is that infinite dates are formatted as "Inf" when coerced to string. This causes problems in write-read roundtrips, because readr's column type guessing does not accept "Inf" as a valid date, so the whole column is read back as character:

library(tidyverse)
output <- format_csv(tibble(x = lubridate::as_date(c(0, Inf))))
output
#> [1] "x\n1970-01-01\nInf\n"
input <- suppressMessages(read_csv(output))
attr(input, "spec")
#> cols(
#>   x = col_character()
#> )

Assuming this is the intended behaviour, there are several possible solutions. Personally, I would prefer a change on the read side, where read_delim() etc. accept "Inf" and "-Inf" as valid dates. This violates ISO 8601, but it would preserve the distinction between missing dates and infinite dates.
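Until something like that exists, the read-side idea can be approximated by hand. The following is only a minimal sketch in base R, assuming the column was read as character (for example with col_character()); parse_date_with_inf() is a hypothetical helper, not a readr function.

```r
# Hypothetical helper (not part of readr): parse a character vector of
# "YYYY-MM-DD" dates, additionally accepting "Inf" and "-Inf".
parse_date_with_inf <- function(x) {
  out <- rep(NA_real_, length(x))

  # "Inf" / "-Inf" become infinite day counts
  is_inf <- x %in% c("Inf", "-Inf")
  out[is_inf] <- as.numeric(x[is_inf])

  # Everything else goes through the regular date parser;
  # strings that do not match the format stay NA
  ok <- !is_inf & !is.na(x)
  out[ok] <- as.numeric(as.Date(x[ok], format = "%Y-%m-%d"))

  # Rebuild a Date vector from days since 1970-01-01
  .Date(out)
}

dates <- parse_date_with_inf(c("1970-01-01", "Inf", "-Inf", NA))
```

This keeps missing values (NA) and infinite values distinct after a roundtrip, which is the property the proposal is after.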

If that's not an option, there are also solutions on the write side.

  1. output infinite dates as something closer to ISO 8601, like "9999-13-00";
  2. output them as NA (functionally the same as before);
  3. raise an error if infinite dates are written.

A parameter to control this is of course possible, but the important choice, in my view, is then what the default should be.
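As a rough illustration of write-side options 2 and 3, here is a sketch of a pre-processing step that could be applied to a data frame before writing it. prepare_infinite_dates() and its on_infinite argument are hypothetical names, not anything readr provides, and the default used here ("na") is picked arbitrarily for the sketch rather than as a recommendation.

```r
# Hypothetical helper (not part of readr): handle infinite values in Date
# columns before a data frame is written out.
#   on_infinite = "na"    replaces infinite dates with NA (option 2)
#   on_infinite = "error" stops with an error (option 3)
prepare_infinite_dates <- function(df, on_infinite = c("na", "error")) {
  on_infinite <- match.arg(on_infinite)

  for (nm in names(df)) {
    col <- df[[nm]]
    if (!inherits(col, "Date")) next

    # Check the underlying day counts for +/-Inf
    bad <- is.infinite(unclass(col))
    if (!any(bad)) next

    if (on_infinite == "error") {
      stop("Column `", nm, "` contains infinite dates.", call. = FALSE)
    }

    col[bad] <- NA
    df[[nm]] <- col
  }

  df
}

df <- data.frame(x = .Date(c(0, Inf)))
cleaned <- prepare_infinite_dates(df)
```

The result could then be passed to format_csv() or write_csv() as usual; with on_infinite = "error" the call fails instead of silently writing a value that cannot be read back as a date.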

keesdeschepper avatar Jan 12 '23 15:01 keesdeschepper

Do you have more specifics on which version of R, lubridate, or readr had different behaviour?

jennybc avatar Jan 12 '23 19:01 jennybc

Yeah it is since R 4.2.0: "Not strictly fixing a bug, format()ing and print()ing of non-finite Date and POSIXt values NaN and ±Inf no longer show as NA but the respective string, e.g., Inf, for consistency with numeric vector's behaviour, fulfilling the wish of (https://bugs.r-project.org/show_bug.cgi?id=18308)."

keesdeschepper avatar Jan 12 '23 19:01 keesdeschepper

library(readr)
df <- data.frame(x = .Date(c(0, Inf)))
df
#>            x
#> 1 1970-01-01
#> 2        Inf

output <- format_csv(df)
cat(output)
#> x
#> 1970-01-01
#> Inf

read_csv(output, col_types = list())
#> # A tibble: 2 × 1
#>   x         
#>   <chr>     
#> 1 1970-01-01
#> 2 Inf

Created on 2023-07-31 with reprex v2.0.2

hadley avatar Jul 31 '23 22:07 hadley