psych

Missing support for "YAML Schema" (spec 1.2)

Open ulidtko opened this issue 1 year ago • 0 comments

Hi!

Let me open a new thread to discuss a solution to some of the annoyingly recurring issues.

Parsing of Date, right? See #262 (a huge thread), #393, #483, from the oldest #137 to the new generation #672 and #676; there are more.

Datetime in the same bucket.

Parsing of Symbol: I'm far from a hardcore Ruby-head myself, but I see it in the same bucket, too.

The thing is: spec 1.2 solves it all. Let me explain.

Spec TL;DR

I'll reword from Chapter 10 of the spec, the last chapter.

It defines:

  • Failsafe Schema,
  • JSON Schema,
  • Core Schema,
  • and hints at Other Schemas.

To simplify, think of "YAML Schema" as a list of classes (yaml "tags") that are allowed to de/serialize.

The FAILSAFE_SCHEMA comprises tags [map, seq, str].

The JSON_SCHEMA comprises tags [null, bool, int, float] plus those from FAILSAFE_SCHEMA.

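The CORE_SCHEMA comprises the same tags as JSON_SCHEMA; it differs in how plain scalars are resolved to those tags, accepting more human-readable spellings. (Find the exact difference in the spec.)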

Why did I switch to ALL_CAPS? Because these are supposed to be in the YAML library's API. :warning:
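To make the "list of allowed tags" idea concrete, here is a minimal sketch in Ruby. The constant names and the `tag_allowed?` helper are hypothetical; Psych does not define anything like them today, which is the point of this issue.

```ruby
# Hypothetical constants, NOT part of Psych: each schema is modeled
# as the list of short tag names it allows.
FAILSAFE_SCHEMA = %w[map seq str].freeze
JSON_SCHEMA     = (FAILSAFE_SCHEMA + %w[null bool int float]).freeze
# Same tags as JSON_SCHEMA; the two differ in scalar resolution rules,
# which this simplified model does not capture.
CORE_SCHEMA     = JSON_SCHEMA

# Hypothetical helper: is a tag part of the given schema?
def tag_allowed?(schema, tag)
  schema.include?(tag)
end

tag_allowed?(CORE_SCHEMA, 'int')        # => true
tag_allowed?(CORE_SCHEMA, 'timestamp')  # => false (a YAML 1.1 type, absent from spec 1.2 schemas)
```

Under this model, a date-like scalar simply has no tag to resolve to other than `str`.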

Example

I recently hit an issue trying to validate a bunch of YAML files containing... dates, you guessed it, against a pre-existing json-schema.

Used json_schemer, nice library. Long story short, https://github.com/davishmcclurg/json_schemer/issues/203 — it didn't work well. I found no better solution than a filthy monkey-patch.

It's because of a basic impedance mismatch:

  • The best json-schema can do is {"type": "string", "format": "date"}. Reference if you don't believe me. (Although it should be obvious: there's no concept of a Date type in JSON, just ~regexes~ ABNF.)
    • The validator library fully supports that.
  • YAML, however, has a notion of date tag (type) — which is distinct from str tag (a string). That's why field: "2024-12-27" and field: 2024-12-27 do not — and should not! — have the same meaning and behavior.
    • The validator library blows up when verifying {"type": "string", "format": "date"} on #<Date: 2024-12-26 ((2460671j,0s,0n),+0s,-Infj)> parsed from yaml, saying … is not a string. It indeed, correctly, isn't.
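The mismatch is easy to reproduce with Psych alone, no validator involved. A minimal sketch; `load_field` is just a helper name made up for this example:

```ruby
require 'date'
require 'psych'

# Hypothetical helper for this example: load a one-key document and
# return the value under "field".
def load_field(yaml, permitted: [Date])
  Psych.safe_load(yaml, permitted_classes: permitted)['field']
end

load_field('field: "2024-12-27"').class  # => String
load_field('field: 2024-12-27').class    # => Date

# With an empty allow-list, the unquoted form raises
# Psych::DisallowedClass instead of falling back to a String:
# load_field('field: 2024-12-27', permitted: [])
```

So with `safe_load` the choice is between allowing Date and getting a non-string, or getting an exception; there is no switch that says "treat it as str".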

What gives

Now, a bit of cross-pollination.

JavaScript folks hit the exact same conundrum not so long ago. https://github.com/ajv-validator/ajv-cli/issues/122

How did they solve it?.. Well, their yaml library supports :sparkles: YAML Schemas! :sparkles:

So their fix was literally a single-line change that switched the validator's parser to CORE_SCHEMA. YAML-native date tag no more; date validation issues no more; solved, done!

Now, I can't do the same using Psych, can I?

Will I?
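For the sake of discussion, the closest approximation I can think of today is to bypass Psych's scalar conversion entirely: take the AST from `Psych.parse` and resolve plain scalars by hand. A rough sketch under stated assumptions: `CoreSchemaSketch` is a made-up name, not a Psych API; the patterns are adapted from section 10.3.2 of the spec and slightly narrowed (a float like `1.` stays a string); explicit tags, aliases and merge keys are not handled.

```ruby
require 'psych'

# Hypothetical module, NOT part of Psych. Plain scalars are resolved with
# Core-Schema-like patterns only, so dates and symbols stay strings.
module CoreSchemaSketch
  NULL  = /\A(?:null|Null|NULL|~|)\z/
  TRUE  = /\A(?:true|True|TRUE)\z/
  FALSE = /\A(?:false|False|FALSE)\z/
  INT   = /\A[-+]?[0-9]+\z/
  OCT   = /\A0o[0-7]+\z/
  HEX   = /\A0x[0-9a-fA-F]+\z/
  # Narrowed: a dot must be followed by at least one digit.
  FLOAT = /\A[-+]?(?:\.[0-9]+|[0-9]+(?:\.[0-9]+)?)(?:[eE][-+]?[0-9]+)?\z/
  INF   = /\A[-+]?\.(?:inf|Inf|INF)\z/
  NAN   = /\A\.(?:nan|NaN|NAN)\z/

  module_function

  # Parse the first document and convert its AST to plain Ruby values.
  def load(yaml)
    doc = Psych.parse(yaml)
    doc ? convert(doc.root) : nil
  end

  def convert(node)
    case node
    when Psych::Nodes::Mapping
      # Mapping children alternate: key, value, key, value, ...
      node.children.each_slice(2).to_h { |k, v| [convert(k), convert(v)] }
    when Psych::Nodes::Sequence
      node.children.map { |child| convert(child) }
    when Psych::Nodes::Scalar
      # Only plain (unquoted, untagged) scalars are resolved;
      # everything else is kept as a string.
      node.plain ? resolve(node.value) : node.value
    else
      raise ArgumentError, "unsupported node: #{node.class}"
    end
  end

  def resolve(text)
    case text
    when NULL  then nil
    when TRUE  then true
    when FALSE then false
    when INT   then text.to_i          # base 10, so "010" is 10
    when OCT   then text[2..].to_i(8)
    when HEX   then text[2..].to_i(16)
    when FLOAT then text.to_f
    when INF   then text.start_with?('-') ? -Float::INFINITY : Float::INFINITY
    when NAN   then Float::NAN
    else text                          # dates, symbols, anything else: str
    end
  end
end

CoreSchemaSketch.load("day: 2024-12-27\ncount: 3")
# => {"day"=>"2024-12-27", "count"=>3}
```

That is a lot of code for a user to carry around compared to a one-line schema switch, and it throws away everything else Psych does, so I would not call it a solution.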

ulidtko avatar Dec 27 '24 20:12 ulidtko