TextParse.jl
Make date format guess algorithm more robust
I've been running into a weird date parsing issue, and I can't sort out what the pattern is, though I've managed to nail down an MWE.
The linked csv has 4 rows of dates.
julia> csvread("parse_test.csv")
ERROR: ArgumentError: Month: 27 out of range (1:12)
Stacktrace:
[1] Date(::Int64, ::Int64, ::Int64) at ./dates/types.jl:204
[2] tryparsenext(::TextParse.DateTimeToken{Date,DateFormat{Symbol("yyyy/mm/dd"),Tuple{Base.Dates.DatePart{'y'},Base.Dates.Delim{Char,1},Base.Dates.DatePart{'m'},Base.Dates.Delim{Char,1},Base.Dates.DatePart{'d'}}}}, ::String, ::Int64, ::Int64, ::TextParse.LocalOpts) at /Users/kev/.julia/v0.6/TextParse/src/field.jl:431
[3] macro expansion at /Users/kev/.julia/v0.6/TextParse/src/util.jl:23 [inlined]
[4] tryparsenext(::TextParse.Field{Date,TextParse.DateTimeToken{Date,DateFormat{Symbol("yyyy/mm/dd"),Tuple{Base.Dates.DatePart{'y'},Base.Dates.Delim{Char,1},Base.Dates.DatePart{'m'},Base.Dates.Delim{Char,1},Base.Dates.DatePart{'d'}}}}}, ::String, ::Int64, ::Int64, ::TextParse.LocalOpts) at /Users/kev/.julia/v0.6/TextParse/src/field.jl:569
#...
(the stack trace is super long, let me know if it would be useful to post the whole thing)
There are three 27s: two in the second row and one in the last row. If I remove just the last row, it works.
But if I leave the 4th row in and just change the 27 in the last row to a 2, I get the same ERROR: ArgumentError: Month: 27 out of range (1:12).
If I change all the 27s to 2s, I now get ERROR: ArgumentError: Month: 21 out of range (1:12), and again this error goes away if I delete the last row, even though there are no 21s in the last row.
It's not just something weird with that row: this is part of a much larger csv file, and removing only row 4 does not stop the error.
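One possible reading of the stack trace, offered as a guess rather than a diagnosis: the format guesser settles on yyyy/mm/dd because an early value happens to fit it, and a later value in the same column then fails range validation. The sketch below reproduces that kind of mismatch using only the Dates standard library (Julia 1.x, where Dates is a stdlib rather than Base.Dates). The sample strings, the assumed m/d/yy layout, and the helper `parses` are hypothetical and are not taken from the linked csv.

```julia
using Dates

# Hypothetical sample values; the real file contents are not shown in this report.
rows = ["1/5/17", "2/27/17"]

guessed  = dateformat"yyyy/mm/dd"  # the format that appears in the stack trace
intended = dateformat"m/d/yy"      # assumed layout of the data (an assumption)

# Hypothetical helper: true if `s` parses to a valid Date under `fmt`.
function parses(s, fmt)
    try
        Date(s, fmt)
        return true
    catch
        return false
    end
end

for s in rows
    println(s, "  guessed format ok: ", parses(s, guessed),
               "  assumed format ok: ", parses(s, intended))
end
```

If the data really has this shape, a value such as 1/5/17 is accepted by the guessed format (as year 1, month 5, day 17), so the wrong guess only surfaces on a later row whose middle field exceeds 12.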
Note - originally posted as an issue to CSVFiles.jl, but this error seems to be caused by this package.