
[C++] Enable multiple character delimiters in read_csv

Open • asfimport opened this issue 3 years ago • 1 comment

read_csv ParseOptions accepts only a single-character delimiter. A single-character delimiter is very likely to also occur inside the data being loaded, in which case it can no longer serve as a reliable delimiter.

If a two-character delimiter is passed, the current single-character limit produces the error "only single character unicode strings can be converted to Py_UCS4, got length 2".
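
Until multi-character delimiters are supported, one possible workaround is to rewrite the delimiter to a single character before parsing. The sketch below is only an illustration and uses the Python standard library: the helper name replace_delimiter, the "||" delimiter, the sample data, and the choice of the ASCII unit separator ("\x1f") as the stand-in are all assumptions, not anything from Arrow. It also ignores quoting, so a delimiter inside a quoted field would be rewritten too.

```python
import csv
import io


def replace_delimiter(text: str, multi_delim: str, single_delim: str = "\x1f") -> str:
    """Swap a multi-character delimiter for a single stand-in character.

    Hypothetical helper for illustration; it does not understand quoting.
    """
    # Refuse to continue if the stand-in character already appears in the data,
    # since it would then be indistinguishable from a real delimiter.
    if single_delim in text:
        raise ValueError("stand-in delimiter already occurs in the data")
    return text.replace(multi_delim, single_delim)


# Sample input using "||" as the delimiter; note the single "|" inside a value.
raw = "id||name||note\n1||Ada||likes a|b pipes\n2||Bob||plain\n"

converted = replace_delimiter(raw, "||")

# Parse the converted text with the single-character stand-in delimiter.
rows = list(csv.reader(io.StringIO(converted), delimiter="\x1f"))
print(rows)
```

The converted text could presumably be handed to pyarrow.csv.read_csv with ParseOptions(delimiter="\x1f") in the same way, though that path is not exercised here, and the extra pass over the data has a cost for large files.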

Reporter: Jack Howard

Note: This issue was originally created as ARROW-17130. Please see the migration documentation for further details.

asfimport, Jul 19 '22 16:07

I am also looking for multi-character delimiter support in both read and write. Hex values are often used as delimiters, and these are multi-character, which the current implementation does not support.

ajaytho, Jul 17 '23 05:07