lubridate
inconsistent behavior of mdy()
When the month name is not the standard three-letter abbreviation, mdy() behaves differently on a vector of date strings than on a single string. In the following example, mdy() fails to find the %Om %d, %Y format when given the date by itself, but finds it when other date strings are included in the same vector.
library(lubridate)
#>
#> Attaching package: 'lubridate'
#> The following object is masked from 'package:base':
#>
#> date
lubridate.verbose = TRUE
options(lubridate.verbose = TRUE)
mdy(c('Feb 13, 2012'))
#> 1 parsed with %b %d, %Y
#> [1] "2012-02-13"
mdy(c('Sept 13, 1978'))
#> Warning: All formats failed to parse. No formats found.
#> [1] NA
mdy(c('Sept 13, 1978', 'Feb 13, 2012', 'Feb 13, 2012'))
#> 2 parsed with %b %d, %Y
#> 1 parsed with %Om %d, %Y
#> [1] "1978-09-13" "2012-02-13" "2012-02-13"
Created on 2020-03-12 by the reprex package (v0.3.0)
The issue comes down to guess_formats, which is not able to detect the non-standard abbreviation when it is the only input:
> guess_formats(c('Sept 13, 1978'), "mdy")
NULL
> guess_formats(c('Sept 13, 1978', 'Feb 13, 2012', 'Feb 13, 2012'), "mdy")
        Omdy         Omdy          mdy          mdy
"%Om %d, %Y" "%Om %d, %Y"  "%b %d, %Y"  "%b %d, %Y"
The non-standard abbreviation is parsed by the internal parser, which for English would parse the other month names as well.
This will be fixed once I have rewritten the parser to handle all locales without relying on strptime.
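Until then, one possible stopgap is to rewrite the non-standard abbreviation before parsing. This is only a sketch: `normalize_month_abbr` is a hypothetical helper, not part of lubridate, and it assumes an English locale where `%b` expects "Sep".

```r
# Hypothetical helper (not part of lubridate): rewrite the four-letter
# "Sept" (optionally followed by a period) to the three-letter "Sep".
# Full names such as "September" are left untouched.
normalize_month_abbr <- function(x) {
  sub("\\bSept\\b\\.?", "Sep", x, ignore.case = TRUE, perl = TRUE)
}

dates <- normalize_month_abbr(c("Sept 13, 1978"))
print(dates)

# If lubridate is installed, pass the normalized strings to mdy().
if (requireNamespace("lubridate", quietly = TRUE)) {
  print(lubridate::mdy(dates))
}
```

This sidesteps the format guessing for a lone "Sept" string rather than fixing it, so it only helps with the abbreviations you explicitly rewrite.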
I have found a similar problem, but I don't know if it is the same one. I am new to GitHub and wasn't sure whether I should post this as a separate issue, so apologies in advance. The mdy() function fails to recognize strings with "mar" or "Mar" on some occasions. Similar to @dereksonderegger's problem, this behavior is inconsistent.
library(lubridate)
#>
#> Attaching package: 'lubridate'
#> The following objects are masked from 'package:base':
#>
#> date, intersect, setdiff, union
mdy('Mar 13, 1999')
#> Warning: All formats failed to parse. No formats found.
#> [1] NA
mdy('mar 13, 1999')
#> Warning: All formats failed to parse. No formats found.
#> [1] NA
dmy('09 mar 2050')
#> Warning: All formats failed to parse. No formats found.
#> [1] NA
dmy('01 Mar 2020')
#> Warning: All formats failed to parse. No formats found.
#> [1] NA
mdy('Mar 13, 1963','mar 10, 1920', 'apr 29, 2012', 'dec 16, 2012')
#> [1] "1963-03-13" "1920-03-10" "2012-04-29" "2012-12-16"
I know next to nothing about coding, so I don't know if this is an issue about the function itself or a problem with my system. I am using R version 3.6.3 (2020-02-29), Platform: x86_64-w64-mingw32/x64 (64-bit). Again, sorry for any mistakes I made. If additional information is required, please let me know.
@pirx90 which locale are you in?
It's probably the same issue. See also #881 and the workaround there for the time being.
@vspinu I'm using "Spanish Mexico".
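Since the locale appears to matter here, a locale-independent fallback in base R may help when the input is known to use English month abbreviations. This is a sketch and not necessarily the workaround from #881: it temporarily switches `LC_TIME` to the portable "C" locale and supplies the format explicitly instead of relying on format guessing. mdy() and dmy() also take a `locale` argument, but whether passing it avoids this particular problem is not confirmed in this thread.

```r
x <- c("Mar 13, 1999", "01 Mar 2020")

# Remember the current time locale, then switch to the portable "C" locale,
# in which %b matches English month abbreviations.
old_locale <- Sys.getlocale("LC_TIME")
Sys.setlocale("LC_TIME", "C")

# Explicit formats, so no format guessing is involved.
d1 <- as.Date(x[1], format = "%b %d, %Y")
d2 <- as.Date(x[2], format = "%d %b %Y")

# Restore the original time locale.
Sys.setlocale("LC_TIME", old_locale)

print(d1)
print(d2)
```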