node-csv icon indicating copy to clipboard operation
node-csv copied to clipboard

Provide an easy way to handle `null` value

Open somnicattus opened this issue 1 year ago • 1 comments

Summary

Provide an easy way to specify a string value that represents null value.

This issue is related to:

  • csv-parse
  • csv-stringify

Motivation

I utilized these packages to imitate COPY command in SQL and found that there is no official way to handle with null values documented.

Most SQL provides COPY command with an option to specify a string value that represents NULL.

And PostgreSQL treats the value as null only when it is unquoted.

Alternative

No alternative considered.

Draft

Provide following options:

  • null_string: string | undefined - specifies a string value that represents null.
  • unquoted_null: boolean | undefined - should work only with null_string specified. If unquoted_null is not false,
    • with parse, the value specified by null_string is treated as null only when it is unquoted.
    • with stringify, null is emitted as an unquoted null_string value. A string value of null_string is emitted with quotes. This option overrides quoted_empty, quoted_string, quoted_match.
      • If null_string contains delimiter or some special string, unquoted_null won't work.
  • undefined_as_null: boolean | undefined - should work only with null_string specified. If undefied_as_null is not false,
    • with stringify, undefined is emitted as null.

Additional context

I managed to handle null values in a same way as PostgreSQL does, by using cast and quoted_string.

parse({ cast: (v, c) => v === '' && !c.quoting ? null : v, columns: ['id', 'data'] })

// input: id,data\n0,foo\n1,""\n2,
// output: [{ id: '0', data: 'foo' }, { id: '1', data: '' }, { id: '2', data: null }]
stringify({ cast: { string: v => { value: v, quoted_string: v === '', columns: ['id', 'data'] }  } })

// input: [{ id: 0, data: 'foo' }, { id: 1, data: '' }, { id: 2, data: null }]
// output: id,data\n0,foo\n1,""\n2,

It was a little tricky, and stringify could not emit null as unquoted non-empty string. If there was cast.null option for stringify provided, it would be possible.

somnicattus avatar Sep 21 '24 06:09 somnicattus

@somnicattus Thanks for your code suggestions. They worked for my use-case, which is similar to the one you described.

Note that there was is a typo in your code for csv-parse. The context property indicating whether the value was quoted is called quoting, not quoted so your first code block should be written this way:

parse({ cast: (v, c) => v === '' && !c.quoting ? null : v, columns: ['id', 'data'] })

In my case I just wanted csv-parse/csv-stringify to correctly read/write CSV files compatible with PostgreSQL COPY commands (or similarly, psql \copy commands), using the default behavior for NULL values (described below in a quote from PostgreSQL docs). Your code handles this perfectly.

... with the default settings, a NULL is written as an unquoted empty string, while an empty string data value is written with double quotes ("")

But as you pointed out, this would not work when specifying a (non-empty) string value to represent NULL. I do think it would be generally useful for csv-stringify to have cast options for handling both null and undefined.

richard-paragon avatar May 29 '25 23:05 richard-paragon