jackson-dataformats-text
(csv) Allow a multi-character delimiter as the column separator: withColumnSeparator
(from https://github.com/FasterXML/jackson-dataformat-csv/issues/141 by @jefferyyuan)
It's common to use a multi-character delimiter, such as [|], to separate columns in CSV. Sometimes we have no choice, because a third party defines the CSV format that way.
Sample data:
FirstName[|]LastName[|]Address
John[|]Smith[|]123 Main St.
https://github.com/FasterXML/jackson-dataformat-csv/issues/92 supports using multiple characters as the array element separator. It would be great if the same could be done for withColumnSeparator.
I did some Google searching: there seems to be no Java library that supports a multi-character column separator.
But the C# library CsvHelper does: https://github.com/JoshClose/CsvHelper
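Until something like this exists, one possible workaround is to split each line by hand before handing the values on. The following is a minimal sketch in plain Java, not Jackson API: the class name `MultiCharSplit` and its `split` helper are hypothetical, and it assumes the data contains no quoted fields, escapes, or embedded line breaks.

```java
import java.util.ArrayList;
import java.util.List;

class MultiCharSplit {
    // Hypothetical helper, not part of Jackson: splits one line on a literal
    // multi-character separator. Quoted fields and escapes are NOT handled.
    static List<String> split(String line, String separator) {
        if (separator.isEmpty()) {
            throw new IllegalArgumentException("separator must not be empty");
        }
        List<String> columns = new ArrayList<>();
        int start = 0;
        int hit;
        // Find each occurrence of the separator and cut out the text before it.
        while ((hit = line.indexOf(separator, start)) >= 0) {
            columns.add(line.substring(start, hit));
            start = hit + separator.length();
        }
        columns.add(line.substring(start)); // last column (may be empty)
        return columns;
    }

    public static void main(String[] args) {
        // prints [FirstName, LastName, Address]
        System.out.println(split("FirstName[|]LastName[|]Address", "[|]"));
        // prints [John, Smith, 123 Main St.]
        System.out.println(split("John[|]Smith[|]123 Main St.", "[|]"));
    }
}
```

This treats the separator as a literal string rather than a regular expression, so characters like [ and | need no escaping.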
it's been two years -- any news?
@nrapopor It's an OSS project, so if you want to help, there are good ways to do so. Asking "what's the ETA" is on the bottom-5 list of helpful things.
FAQ: if there is progress, there will be an update to the issue.
ORZ
Hi @cowtowncoder ,
Instead of using
public CsvSchema withColumnSeparator(char sep){}
can we change it to
public CsvSchema withColumnSeparator(char[] sep)
and also change the data type of _columnSeparator and the other places that need it?
Let me know your suggestions.
@djay-S The problem is not the method that defines the separator but the implementation behind it: splitting columns is a very performance-sensitive operation, on the critical path. So the real work is in the decoder, implementing it in a way that has a negligible performance penalty. That is the real problem.
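To illustrate where the extra cost would come from, here is a rough sketch of the two kinds of inner loop. It is not Jackson's actual decoder: the class `SeparatorScan` and its methods are hypothetical, and it ignores quoting, escapes, and input that arrives in chunks (where a separator could straddle a buffer boundary).

```java
class SeparatorScan {
    // Single-character separator: exactly one comparison per input character.
    static int countColumns(char[] buf, char sep) {
        int columns = 1;
        for (char c : buf) {
            if (c == sep) {
                columns++;
            }
        }
        return columns;
    }

    // Multi-character separator: each position may need several comparisons,
    // and a successful match has to skip the whole separator.
    static int countColumns(char[] buf, char[] sep) {
        if (sep.length == 0) {
            throw new IllegalArgumentException("separator must not be empty");
        }
        int columns = 1;
        int i = 0;
        while (i < buf.length) {
            if (matchesAt(buf, i, sep)) {
                columns++;
                i += sep.length;
            } else {
                i++;
            }
        }
        return columns;
    }

    // True if the full separator starts at position pos in buf.
    static boolean matchesAt(char[] buf, int pos, char[] sep) {
        if (pos + sep.length > buf.length) {
            return false;
        }
        for (int k = 0; k < sep.length; k++) {
            if (buf[pos + k] != sep[k]) {
                return false;
            }
        }
        return true;
    }
}
```

In the multi-character version a partial match such as "[|" followed by something other than "]" has to be treated as ordinary data, which is extra branching on the hot path that the single-character loop never needs.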