CSV.jl
Relax type restrictions on delim
This issue is inspired by https://discourse.julialang.org/t/reading-data-text-files-delimited-with-both-spaces-tabs/64851
Sometimes more than one character is needed as a delimiter (as in the Discourse discussion linked above). Currently the type of delim is restricted to Union{Nothing, Char, String}. If the restriction were instead Union{Nothing, AbstractChar, String} (or, better still, AbstractString instead of String, though this is less important), then one could define
struct MultiChar{T} <: AbstractChar
    char::T
end
delim = MultiChar(('\t', ' '))
and, with appropriate equality methods, CSV.read could properly read a CSV file that uses a mixture of characters as its delimiter.
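As a rough sketch of what "appropriate equality methods" might look like, the snippet below defines == between the proposed MultiChar type and a plain Char. This is only an illustration of the idea in plain Julia: the two == methods are an assumption about what would be needed, and nothing here is accepted by CSV.jl or Parsers.jl as they currently stand.

```julia
# Hypothetical illustration only: not part of CSV.jl or Parsers.jl.
struct MultiChar{T} <: AbstractChar
    char::T   # a tuple of the characters that should all count as the delimiter
end

# A MultiChar compares equal to a Char when that Char is one of its members.
Base.:(==)(m::MultiChar, c::Char) = c in m.char
Base.:(==)(c::Char, m::MultiChar) = m == c

delim = MultiChar(('\t', ' '))

delim == '\t'   # true
' ' == delim    # true
delim == ','    # false
```

A full AbstractChar subtype would normally also need methods such as codepoint, which have no obvious meaning for a set of characters, so this sketch covers comparison only.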
The bigger issue here is that in the Parsers.jl package, if you pass in a Char delim, it checks whether it is ASCII and, if so, converts it to a UInt8. Otherwise, for multi-byte Chars, it converts the Char to a String. So the Parsers.jl parsing code assumes delim will be either a UInt8 or a String.
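The conversion described above can be pictured roughly as follows. The function name normalize_delim is hypothetical and this is not the actual Parsers.jl source; it only mirrors the behaviour as stated in this comment, to show why a custom AbstractChar would not survive this step unchanged.

```julia
# Hypothetical helper mirroring the behaviour described above;
# not the actual Parsers.jl implementation.
normalize_delim(d::Char) = isascii(d) ? UInt8(d) : string(d)
normalize_delim(d::String) = d

normalize_delim(',')    # 0x2c
normalize_delim('\t')   # 0x09
normalize_delim('→')    # "→"
```

Because the downstream parsing code would then only ever see a UInt8 or a String, a delimiter type that relies on custom == methods would have nothing to hook into.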