Leading numeric zeros in JSON format

Open OXYAMINE opened this issue 2 years ago • 5 comments

I'm converting CSV to JSON, and values like "0012AS4" are emitted as {"Key": "0012AS4"}. But if the value is like "0123456789", it is emitted in the JSON output as {"Key": 0123456789 }, which is invalid JSON.

OXYAMINE, May 16 '23 00:05

The same issue applies to values like "+12123": they are treated as numbers even if explicitly quoted. The result is invalid JSON output.
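To make the "invalid JSON" claim concrete, here is a minimal sketch that feeds the two reported output shapes to a strict JSON parser. It uses Python's standard json module rather than Miller itself, and the helper name is_valid_json is made up for illustration.

```python
import json


def is_valid_json(text: str) -> bool:
    """Return True if text parses as strict JSON (hypothetical helper)."""
    try:
        json.loads(text)
        return True
    except json.JSONDecodeError:
        return False


# Output shapes taken from the report above, plus two valid variants.
samples = {
    "quoted string": '{"Key": "0123456789"}',
    "leading zero": '{"Key": 0123456789}',
    "leading plus": '{"Key": +12123}',
    "plain integer": '{"Key": 123456789}',
}

results = {name: is_valid_json(text) for name, text in samples.items()}
for name, ok in results.items():
    print(f"{name}: {'valid' if ok else 'invalid'}")
```

The JSON number grammar allows neither a leading zero before further digits nor a leading plus sign, so such values have to be emitted as quoted strings to stay valid.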

OXYAMINE, May 16 '23 02:05

Actually, the problem is wider. How is a value determined to be a number or a string during JSON conversion? If a value has been explicitly quoted, why is it still converted to a number?

For example, the value 1867e593836000726799386923505081003007900978 is considered a number, and the JSON syntax is OK. But obviously this isn't a number, and an attempt to store it will fail.
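As a rough illustration of how such a value can be syntactically valid yet unusable, the sketch below parses it with Python's standard json module. This is only one consumer's behavior, assumed here for demonstration; other parsers or databases may fail differently.

```python
import json
import math

# The identifier from the report, emitted unquoted: it matches the JSON
# number grammar (digits followed by an exponent), so parsing succeeds.
text = '{"Key": 1867e593836000726799386923505081003007900978}'

parsed = json.loads(text)
value = parsed["Key"]

# The exponent is far beyond what a double can hold, so the value overflows.
overflowed = isinstance(value, float) and math.isinf(value)
print("parsed as infinity:", overflowed)

# Writing the parsed value back out as strict JSON is rejected.
try:
    json.dumps(parsed, allow_nan=False)
    round_trips = True
except ValueError:
    round_trips = False
print("round-trips as strict JSON:", round_trips)

# Kept as a quoted string, the identifier survives unchanged.
quoted = json.loads('{"Key": "1867e593836000726799386923505081003007900978"}')
preserved = quoted["Key"] == "1867e593836000726799386923505081003007900978"
print("preserved when quoted:", preserved)
```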

OXYAMINE, May 16 '23 03:05

But if the value is like "0123456789", it is emitted in the JSON output as {"Key": 0123456789 }, which is invalid JSON.

Hi, I have used this sample input

fielda,fieldb
a,0123456789

and if I run mlr --c2j cat t.csv, I do not get an unquoted value like {"Key": 0123456789 }, but a quoted one like {"Key": "0123456789" }:

[
{
  "fielda": "a",
  "fieldb": "0123456789"
}
]

I'm using mlr 6.7.0

aborruso, May 17 '23 17:05

even if explicitly quoted

There are two different things.

For JSON -- "123" means string and 123 means number. Double quotes serve as type indicators.

For CSV -- double quotes are there as delimiters -- to allow people to put embedded commas and/or newlines into cells.

The Go CSV-parser library I'm using doesn't return to the caller any information about whether a field was quoted. However, I've already forked and hacked on it a bit; I can look into trying to get back a was-quoted flag from the parser ...
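The point about CSV quotes being delimiters rather than type indicators can be sketched with Python's standard csv module. This is an analogy only: it is not the Go parser Miller uses, and the sample data simply reuses the field names from the example earlier in the thread.

```python
import csv
import io

# Same record, once with the second field quoted and once without.
quoted_input = 'fielda,fieldb\na,"0123456789"\n'
unquoted_input = 'fielda,fieldb\na,0123456789\n'

rows_quoted = list(csv.reader(io.StringIO(quoted_input)))
rows_unquoted = list(csv.reader(io.StringIO(unquoted_input)))

# With default settings the parser strips the quotes and returns plain
# strings, so the caller cannot tell which input was quoted.
print(rows_quoted == rows_unquoted)

# What the quotes are for: protecting an embedded comma inside one cell.
rows_comma = list(csv.reader(io.StringIO('fielda,fieldb\na,"x,y"\n')))
print(rows_comma[1])
```

Any type inference applied after parsing therefore sees the same text for both inputs, unless the parser is extended to report a was-quoted flag.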

johnkerl, Jun 06 '23 19:06

Meanwhile, please also check out mlr -S (maybe overkill, but it does avoid type inference) ...

johnkerl, Jun 06 '23 19:06