Inconsistent quoting behavior when serializing strings starting with zero

Open jefmathiot opened this issue 1 year ago • 3 comments

Hello there,

We noticed odd behavior when serializing UUIDs. Some of them, such as 0054971b-b746-4776-ae7a-d942cd511393, are quoted during dump, whereas others, such as 0645cada-c729-48cb-a4e3-9d8cbc7cb510, aren't.

The observed behavior seems related to this commit created to fix this issue with floats handling.

The YAML 1.2 schema uses this regexp /[-+]?(\.[0-9]+|[0-9]+(\.[0-9]*)?)([eE][-+]?[0-9]+)?/ to match floats, but Psych uses this regexp instead: /\A0[0-7]*[89]/, which, as I understand it, leads to the inconsistent behavior.
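To make the difference concrete, here is a minimal sketch that applies the second regexp from the paragraph above to the two UUIDs from this report. It only exercises the pattern itself; the constant names are made up for illustration, and it assumes (without checking Psych's source) that a match on this pattern is what triggers the quoting.

```ruby
# Pattern quoted in this issue, copied verbatim from the text above.
# It has a start anchor (\A) but no end anchor.
OCTAL_LIKE = /\A0[0-7]*[89]/

# Illustrative constant names; the UUIDs are the ones from this report.
QUOTED_UUID   = "0054971b-b746-4776-ae7a-d942cd511393"
UNQUOTED_UUID = "0645cada-c729-48cb-a4e3-9d8cbc7cb510"

# "0", then "054" as octal digits, then "9": the prefix "00549" matches.
puts OCTAL_LIKE.match?(QUOTED_UUID)   # true

# "0", then "645", then "c": no 8 or 9 follows the octal digits, so no match.
puts OCTAL_LIKE.match?(UNQUOTED_UUID) # false
```

So whether a UUID matches depends only on whether an 8 or 9 happens to follow its leading run of octal digits, which would explain why the quoting looks arbitrary.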

I'd be glad to craft a PR, but I'm not sure whether Psych's behavior is expected or not.

Thank you :)

jefmathiot avatar Dec 20 '24 13:12 jefmathiot

Hi,

Yes, any patches are welcome. It seems like neither of these should be quoted. Perhaps the regular expression is missing an anchor? The second RE you quoted is for handling octal numbers (not floats).

tenderlove avatar Jan 16 '25 23:01 tenderlove

Hi @tenderlove,

Sorry for the late answer, it seems I missed your comment /o. Yes, I guess adding an anchor would probably fix the problem: /\A0[0-7]*[89]\z/.
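As a quick sanity check of that idea, the sketch below compares the pattern with and without the trailing \z on the two UUIDs from the report plus a short invalid-octal-looking string. This is only a regexp comparison under my own sample inputs and constant names, not a test of Psych itself, so the actual dump output would still need to be verified against a patched build.

```ruby
# Pattern as quoted in the issue, and the same pattern with an end anchor.
UNANCHORED = /\A0[0-7]*[89]/
ANCHORED   = /\A0[0-7]*[89]\z/

# Sample inputs: the two UUIDs from the report, plus "0778", a hypothetical
# string that looks like an octal literal ending in an invalid digit.
SAMPLES = [
  "0054971b-b746-4776-ae7a-d942cd511393",
  "0645cada-c729-48cb-a4e3-9d8cbc7cb510",
  "0778"
].freeze

SAMPLES.each do |s|
  # Print whether each pattern matches the sample.
  puts format("%-38s unanchored=%-5s anchored=%s",
              s, UNANCHORED.match?(s), ANCHORED.match?(s))
end
```

With the end anchor, neither UUID matches any more, while a string such as "0778" still does.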

jefmathiot avatar Apr 29 '25 16:04 jefmathiot

I've just bumped into this at work... and I was about to write a PR, when I realized that there is an open one that seems to solve the problem: #628 (uses the longer YAML 1.2 regex).

Psych 5.2.6:

y = Psych.dump({"id" => "0825f34a-7a8f-4f0b-bb53-fdd456004a67"})
#  => "---\nid: '0825f34a-7a8f-4f0b-bb53-fdd456004a67'\n"

Psych.load(y)
#  => {"id" => "0825f34a-7a8f-4f0b-bb53-fdd456004a67"}

PR #628:

y = Psych.dump({"id" => "0825f34a-7a8f-4f0b-bb53-fdd456004a67"})
#  => "---\nid: 0825f34a-7a8f-4f0b-bb53-fdd456004a67\n"

Psych.load(y)
#  => {"id" => "0825f34a-7a8f-4f0b-bb53-fdd456004a67"}

CarlosCD avatar May 15 '25 18:05 CarlosCD