Infinite dates are not parsed correctly
What seems to be new in R is that infinite dates are formatted as "Inf" when coerced to character. This breaks write-read round trips, because readr's column type guessing does not accept "Inf" as a valid date:
library(tidyverse)
output <- format_csv(tibble(x = lubridate::as_date(c(0, Inf))))
output
#> [1] "x\n1970-01-01\nInf\n"
input <- suppressMessages(read_csv(output))
attr(input, "spec")
#> cols(
#> x = col_character()
#> )
Assuming this behaviour is intended, there are a few possible solutions. Personally, I would prefer a change on the read side, where read_delim() etc. accept "Inf" and "-Inf" as valid dates. This violates ISO 8601, but it would be nice to preserve the distinction between missing dates and infinite dates.
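Until something like that exists, the read-side idea can be approximated by hand. The following is only a minimal sketch in base R, not anything readr provides: `parse_date_with_inf()` is a hypothetical helper name, and it assumes the column was read as character and that regular dates use the `%Y-%m-%d` layout.

```r
# Hypothetical helper (not part of readr): parse date strings while
# mapping "Inf" / "-Inf" to infinite Date values instead of NA.
parse_date_with_inf <- function(x) {
  is_pos_inf <- !is.na(x) & x == "Inf"
  is_neg_inf <- !is.na(x) & x == "-Inf"
  is_regular <- !is.na(x) & !is_pos_inf & !is_neg_inf

  # Work on the underlying day counts; anything unparseable stays NA.
  days <- rep(NA_real_, length(x))
  days[is_regular] <- as.numeric(as.Date(x[is_regular], format = "%Y-%m-%d"))
  days[is_pos_inf] <- Inf
  days[is_neg_inf] <- -Inf

  structure(days, class = "Date")
}

dates <- parse_date_with_inf(c("1970-01-01", "Inf", "-Inf", NA))
```

With readr, the column would first need to be read as character, for example with `col_types = cols(x = col_character())`, and then converted with the helper.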
If that's not an option, there are also solutions on the write side.
- output infinite dates as something more in line with ISO 8601, like "9999-13-00";
- output them as NA (which is functionally how it behaved before);
- give an error if infinite dates are written.
A parameter to control this is of course possible, but the important choice, in my view, is then what the default would be.
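To make the NA and error options concrete, here is a rough sketch of what a user could do before writing. It is not readr behaviour: `handle_infinite_dates()` and its `on_infinite` argument are hypothetical names, and the default chosen here ("na") is just one possible answer to the question of defaults.

```r
# Hypothetical pre-write helper (not part of readr): either replace
# infinite dates with NA or stop with an error.
handle_infinite_dates <- function(x, on_infinite = c("na", "error")) {
  on_infinite <- match.arg(on_infinite)

  # is.infinite() is applied to the underlying numbers; NA is not infinite.
  infinite <- is.infinite(unclass(x))
  if (!any(infinite)) {
    return(x)
  }

  if (on_infinite == "error") {
    stop("Infinite dates cannot be written.", call. = FALSE)
  }

  x[infinite] <- as.Date(NA)
  x
}

d <- structure(c(0, Inf, -Inf), class = "Date")
cleaned <- handle_infinite_dates(d)  # infinite entries become NA
```

Calling `handle_infinite_dates(d, on_infinite = "error")` stops instead, which corresponds to the third option in the list.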
Do you have more specifics on which version of R, lubridate, readr, or something else had the different behaviour?
Yes, this has been the case since R 4.2.0: "Not strictly fixing a bug, format()ing and print()ing of non-finite Date and POSIXt values NaN and ±Inf no longer show as NA but the respective string, e.g., Inf, for consistency with numeric vector's behaviour, fulfilling the wish of (https://bugs.r-project.org/show_bug.cgi?id=18308)."
library(readr)
df <- data.frame(x = .Date(c(0, Inf)))
df
#>            x
#> 1 1970-01-01
#> 2        Inf
output <- format_csv(df)
cat(output)
#> x
#> 1970-01-01
#> Inf
read_csv(output, col_types = list())
#> # A tibble: 2 × 1
#>   x
#>   <chr>
#> 1 1970-01-01
#> 2 Inf
Created on 2023-07-31 with reprex v2.0.2