Feature Request: Allow Underscore in Numeric Literals Again

Open macdjord opened this issue 1 year ago • 6 comments

YAML 1.1 allowed numeric literals, i.e. ints and floats, to include underscores, which were semantically ignored. They could thus be used for digit grouping in long values, much as commas are often used as thousands-separators in human-readable text. This was very convenient, and greatly improved readability for large values.
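As a rough illustration of "semantically ignored", here is a minimal sketch of a decimal-only parser that accepts underscores and drops them before conversion. The pattern is a simplified reading of the decimal branch of the YAML 1.1 int type, and the function name is made up for this example; the actual 1.1 type also covers other bases.

```python
import re

# Simplified, decimal-only pattern (assumption: modelled on the base-10
# branch of the YAML 1.1 int type; other bases are left out on purpose).
DECIMAL_WITH_UNDERSCORES = re.compile(r"[-+]?(0|[1-9][0-9_]*)")


def parse_int_ignoring_underscores(text: str) -> int:
    """Parse a decimal integer, treating underscores as pure digit grouping."""
    if not DECIMAL_WITH_UNDERSCORES.fullmatch(text):
        raise ValueError(f"not a decimal integer: {text!r}")
    # The underscores carry no meaning, so they are removed before conversion.
    return int(text.replace("_", ""))


print(parse_int_ignoring_underscores("1_000_000"))
print(parse_int_ignoring_underscores("1_0"))
```

Under this reading, 1_000_000 and 1000000 denote the same value, which is the digit-grouping convenience described above.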

YAML 1.2.0 dropped this feature, for no reason I can find documented anywhere.

I propose that YAML 1.3 bring this feature back.

macdjord avatar Aug 24 '24 01:08 macdjord

I believe the reason is so that the regular expressions for implicit resolution of !!int and !!float match the JSON rules, since the aim of YAML 1.2 was to be a strict superset of JSON.

KJTsanaktsidis avatar Sep 11 '24 01:09 KJTsanaktsidis

Sounds like you should stick with YAML 1.1, since it supports 1_0 == 10. There are lots of 1.1 parsers. I think the 1.2 is good as it is, for this question.

eighthave avatar Feb 26 '25 20:02 eighthave

I find the request of the OP legitimate:

  • Grouping of digits can help humans. We've learned this to be handy in a couple of written and programming languages.

I think the counter argument with JSON is flawed:

  • All JSON numbers (without underscores) are parsed the same way in both languages.
  • Numbers with underscores are additionally parsed as numbers, whereas they would be an error in JSON. Accepting more than JSON does is normal practice in YAML, even for strings.
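The two points above can be checked mechanically. This is only a sketch: JSON_NUMBER follows the number grammar of RFC 8259, while EXTENDED_NUMBER is a hypothetical pattern invented here (it is not from any YAML specification) that also allows underscores between digits of the integer part.

```python
import re

# JSON number grammar (RFC 8259) written as a regular expression.
JSON_NUMBER = re.compile(r"-?(0|[1-9][0-9]*)(\.[0-9]+)?([eE][-+]?[0-9]+)?")

# Hypothetical extension: the same grammar, plus underscore-separated
# digit groups in the integer part. Not taken from any spec.
EXTENDED_NUMBER = re.compile(
    r"-?(0|[1-9][0-9]*(_[0-9]+)*)(\.[0-9]+)?([eE][-+]?[0-9]+)?"
)

json_samples = ["0", "-12", "3.14", "1e10", "-2.5E-3"]
underscore_samples = ["1_000", "12_345.5", "-1_000_000"]

# Every sample JSON number is still accepted by the extended pattern.
json_still_accepted = all(EXTENDED_NUMBER.fullmatch(s) for s in json_samples)

# The underscore forms are purely additional: JSON itself rejects them.
json_rejects_new = not any(JSON_NUMBER.fullmatch(s) for s in underscore_samples)
extended_accepts_new = all(EXTENDED_NUMBER.fullmatch(s) for s in underscore_samples)

print(json_still_accepted, json_rejects_new, extended_accepts_new)
```

A handful of samples is not a proof, but the extended pattern only adds an optional group to the JSON one, so anything the JSON pattern matches is matched by the extended pattern too.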

IMO an oversight in YAML 1.2, because it removed a (useful) feature.

Yet after reading https://en.m.wikipedia.org/wiki/Decimal_separator, I'm not happy with the plethora of possibilities. Maybe keeping it simple has its merits here?

Just my 2 cents.

UnePierre avatar Feb 27 '25 04:02 UnePierre

> I believe the reason is so that the regular expressions for implicit resolution of !!int and !!float match the JSON rules, since the aim of YAML 1.2 was to be a strict superset of JSON.

Allowing underscores would still be a strict superset.

> Sounds like you should stick with YAML 1.1, since it supports 1_0 == 10. There are lots of 1.1 parsers. I think the 1.2 is good as it is, for this question.

'Just don't upgrade' is never good advice. And, while this is one of the reasons I continue to use 1.1 parsers for my own projects, there are plenty of cases where I need to write YAML files but do not control the parser which will be used to interpret them.

macdjord avatar Feb 27 '25 04:02 macdjord

> Allowing underscores would still be a strict superset.

That does seem to be true, and I don't know why I suggested otherwise last year.

The relevant part of the spec (https://yaml.org/spec/1.2.2/#1022-tag-resolution) says:

> Scalars with the “?” non-specific tag (that is, plain scalars) are matched with a list of regular expressions (first match wins, e.g. 0 is resolved as !!int). In principle, JSON files should not contain any scalars that do not match at least one of these. Hence the YAML processor should consider them to be an error.
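To make "first match wins" concrete, here is a minimal sketch of such a resolver. The patterns are my reading of the JSON schema table in the 1.2.2 spec and should be treated as an approximation; the names JSON_SCHEMA_RULES and resolve_plain_scalar are made up for this example.

```python
import re

# (pattern, tag) pairs, tried in order. Assumption: these approximate the
# JSON schema table in the YAML 1.2.2 spec; they are not copied verbatim.
JSON_SCHEMA_RULES = [
    (re.compile(r"null"), "!!null"),
    (re.compile(r"true|false"), "!!bool"),
    (re.compile(r"-?(0|[1-9][0-9]*)"), "!!int"),
    (re.compile(r"-?(0|[1-9][0-9]*)(\.[0-9]*)?([eE][-+]?[0-9]+)?"), "!!float"),
]


def resolve_plain_scalar(text: str) -> str:
    """Return the tag of the first matching rule, or raise if none matches."""
    for pattern, tag in JSON_SCHEMA_RULES:
        if pattern.fullmatch(text):
            return tag
    # Under the JSON schema, an unmatched plain scalar is an error.
    raise ValueError(f"plain scalar matches no rule: {text!r}")


print(resolve_plain_scalar("0"))
print(resolve_plain_scalar("1.5"))
```

With these rules, 0 resolves as !!int because the int rule comes before the float rule, and a scalar such as 1_000 matches nothing and is rejected.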

I don't know why YAML wouldn't still be a strict JSON superset if it were allowed to implicitly convert unquoted plain scalars containing underscores to !!int. So I'm not sure why the YAML processor "should consider them to be an error".

And furthermore

> Note: The regular expression for float does not exactly match the one in the JSON specification, where at least one digit is required after the dot: ( \. [0-9]+ ). The YAML 1.2 specification intended to match JSON behavior, but this cannot be addressed in the 1.2.2 specification.

kind of implies that it's important that they're the same. But I don't know why.
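The mismatch that note describes is easy to see with two patterns side by side. This is a sketch only: the first pattern is my reading of the float rule in the 1.2.2 JSON schema (digits after the dot are optional), the second follows the JSON number grammar (at least one digit after the dot), and the helper name accepts is made up.

```python
import re

# Assumption: approximates the float rule of the YAML 1.2.2 JSON schema,
# where the digits after the dot are optional.
YAML_JSON_SCHEMA_FLOAT = re.compile(
    r"-?(0|[1-9][0-9]*)(\.[0-9]*)?([eE][-+]?[0-9]+)?"
)

# JSON number grammar: at least one digit is required after the dot.
JSON_NUMBER = re.compile(r"-?(0|[1-9][0-9]*)(\.[0-9]+)?([eE][-+]?[0-9]+)?")


def accepts(text: str) -> tuple[bool, bool]:
    """Return (accepted by the YAML pattern, accepted by the JSON pattern)."""
    return (
        YAML_JSON_SCHEMA_FLOAT.fullmatch(text) is not None,
        JSON_NUMBER.fullmatch(text) is not None,
    )


for sample in ["1.5", "1.", "1.e3"]:
    print(sample, accepts(sample))
```

So a scalar like 1. is accepted on the YAML side but is not valid JSON, which means the YAML rule is already slightly wider than JSON here.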

KJTsanaktsidis avatar Feb 27 '25 04:02 KJTsanaktsidis

Oh, I see. If I keep reading, I see that the JSON schema just matches the JSON regexes, but if you parse a document with the "Core schema", it can turn more things into numbers, e.g.

https://yaml.org/spec/1.2.2/#103-core-schema

0x [0-9a-fA-F]+ tag:yaml.org,2002:int (Base 16)

So yeah, I guess underscore-as-thousands-separator could be in the Core schema, and I don't know why it's not.
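For what it's worth, here is a sketch of how that might look. The three base rules are my reading of the integer rows of the Core schema table; UNDERSCORE_DECIMAL and the allow_underscores flag are hypothetical and do not exist in any published schema or parser that I know of.

```python
import re

# Integer rules as I read them in the Core schema: decimal, octal, hex.
CORE_INT_RULES = [
    (re.compile(r"[-+]?[0-9]+"), 10),
    (re.compile(r"0o[0-7]+"), 8),
    (re.compile(r"0x[0-9a-fA-F]+"), 16),
]

# Hypothetical extra rule: decimal digit groups joined by single underscores.
UNDERSCORE_DECIMAL = re.compile(r"[-+]?[0-9]+(_[0-9]+)+")


def resolve_core_int(text: str, allow_underscores: bool = False):
    """Return the integer value, or None if the scalar is not an int."""
    for pattern, base in CORE_INT_RULES:
        if pattern.fullmatch(text):
            # Strip the 0o / 0x prefix before converting non-decimal forms.
            digits = text if base == 10 else text[2:]
            return int(digits, base)
    if allow_underscores and UNDERSCORE_DECIMAL.fullmatch(text):
        return int(text.replace("_", ""))
    return None


print(resolve_core_int("0x1F"))
print(resolve_core_int("1_000"))
print(resolve_core_int("1_000", allow_underscores=True))
```

Without the flag, 1_000 is not an int (the Core schema would fall through to a string); with it, the same scalar becomes the integer one thousand, and every scalar that was already an int resolves exactly as before.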

KJTsanaktsidis avatar Feb 27 '25 04:02 KJTsanaktsidis