pyyaml

Fix #866: Add constructor for tag:yaml.org,2002:value

Open mehulanshumali opened this issue 8 months ago • 5 comments

Problem

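PyYAML fails to parse YAML documents containing a standalone = character, raising a ConstructorError ("could not determine a constructor for the tag 'tag:yaml.org,2002:value'").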

This affects real-world use cases like Prometheus CRDs with enum values containing =.

Root Cause

The YAML resolver correctly identifies standalone = as having the tag tag:yaml.org,2002:value, but no constructor was defined for this tag in the SafeConstructor class.
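The split between resolution and construction can be observed directly. The following is a small sketch, not part of the PR: it uses yaml.compose(), which builds the node tree and stops before any constructor runs, to show the tag assigned to a plain = versus a quoted one. Whether the final safe_load call raises depends on the installed PyYAML version, so no output is claimed for it.

```python
import yaml

# compose() runs the scanner, parser and resolver only; no constructor is invoked.
plain = yaml.compose("=")
quoted = yaml.compose("'='")

print(plain.tag)   # tag:yaml.org,2002:value
print(quoted.tag)  # tag:yaml.org,2002:str

# Construction is the step that fails when no constructor is registered
# for the value tag. On a PyYAML build that already has one, this just loads.
try:
    print(yaml.safe_load("compare: ="))
except yaml.constructor.ConstructorError as exc:
    print("construction failed:", exc.problem)
```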

Solution

  • Added constructor mapping for tag:yaml.org,2002:value to construct_yaml_str
  • Treats a standalone = as a string scalar (consistent with the YAML 1.1 specification)
  • Minimal, targeted fix with no breaking changes
  • Single line addition to lib/yaml/constructor.py
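The same mapping can also be registered from user code on an unpatched install. This is a minimal sketch under stated assumptions: ValueAsStrLoader is a hypothetical subclass name chosen for illustration (it is not part of PyYAML or of this PR), and subclassing is used so that the global SafeLoader is left unmodified.

```python
import yaml


class ValueAsStrLoader(yaml.SafeLoader):
    """Hypothetical SafeLoader subclass that loads a standalone '=' as a string."""


# add_constructor() copies the constructor table onto the subclass first,
# so yaml.SafeLoader itself is not changed by this registration.
ValueAsStrLoader.add_constructor(
    "tag:yaml.org,2002:value",
    yaml.constructor.SafeConstructor.construct_yaml_str,
)

text = "compare: =\nitems: [=, '=', a=b]\n"
doc = yaml.load(text, Loader=ValueAsStrLoader)
print(doc)  # {'compare': '=', 'items': ['=', '=', 'a=b']}
```

Only a scalar that is exactly = resolves to the value tag, so strings such as a=b were never affected and load the same way with or without the registration.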

Testing

  • ✅ Verified fix works with SafeLoader, FullLoader, UnsafeLoader
  • ✅ Tested with real-world Prometheus CRD YAML from the issue
  • ✅ Tested edge cases: flow sequences, mappings, mixed quoting
  • ✅ Verified no regressions in existing functionality
  • ✅ Test Script
import sys
import os
# Add the local lib directory to the path to use our modified version
sys.path.insert(0, os.path.join(os.getcwd(), 'lib'))

import requests
import yaml

url = "https://github.com/prometheus-operator/prometheus-operator/releases/download/v0.83.0/stripped-down-crds.yaml"

# Download the spec (fail fast on network stalls and HTTP errors)
response = requests.get(url, timeout=30)
response.raise_for_status()
res = response.text

# Parse multi-document YAML
documents = list(yaml.safe_load_all(res))
print("Parsed YAML documents:")
for i, doc in enumerate(documents, 1):
    print(f"\nDocument {i}:")
    print(doc)
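For a check that does not depend on network access, the loader coverage listed above could be approximated offline. This is a sketch, not the PR's test suite: check_loaders is a hypothetical helper name, and it reports a result per loader instead of asserting one, because the outcome depends on whether the installed PyYAML includes the fix.

```python
import yaml


def check_loaders(text="compare: ="):
    """Try to load `text` with each loader and record the result or the error."""
    results = {}
    for loader in (yaml.SafeLoader, yaml.FullLoader, yaml.UnsafeLoader):
        try:
            results[loader.__name__] = yaml.load(text, Loader=loader)
        except yaml.YAMLError as exc:
            # Without a constructor for the value tag, a ConstructorError lands here.
            results[loader.__name__] = f"error: {exc}"
    return results


for name, outcome in check_loaders().items():
    print(f"{name}: {outcome}")
```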

Files Changed

  • lib/yaml/constructor.py: Added constructor for tag:yaml.org,2002:value

Backwards Compatibility

✅ No breaking changes - only adds support for previously unsupported syntax

Fixes #866

Jun 28 '25 06:06 mehulanshumali

Any news on this one?

Thank you for providing a fix!

Is there a time frame when we can expect this to be available?

Aug 03 '25 12:08 hegerdes

I'm confused why anyone is pushing back on this and saying that = should fail on parsing. = is not a special char in yaml. It definitely gives me that fuzzy special feeling about it, and it seems like maybe it might be special some day, but it's not as of yaml 1.2. I would love to see this fix approved and merged in, because the npm package yaml is the fastest growing yaml support package for JavaScript projects, and it would default to serialize { compare: '=' } as compare: =, not compare: '='. I also can't imagine how this fix could possibly be a breaking change.

Sep 08 '25 17:09 jkoudys

> I'm confused why anyone is pushing back on this and saying that = should fail on parsing. = is not a special char in yaml. It definitely gives me that fuzzy special feeling about it, and it seems like maybe it might be special some day, but it's not as of yaml 1.2.

PyYAML implements YAML 1.1, and there it is special: https://yaml.org/type/value.html

The spec only talks about it when it's a mapping key. It doesn't specifically say what the library should do when it's not a mapping key, but returning the simple string seems the best thing to do, that's right.

> I would love to see this fix approved and merged in, because the npm package yaml is the fastest growing yaml support package for JavaScript projects, and it would default to serialize { compare: '=' } as compare: =, not compare: '='.

You mention the node yaml package. If you are using YAML 1.2, it would make sense to try https://pypi.org/project/yamlcore/, which does not treat = specially. (I know node yaml supports YAML 1.1 as well, but often people are not even aware which version they are using, so I thought I'd rather mention this possibility.)

Sep 08 '25 18:09 perlpunk

Similar fix in https://github.com/yaml/pyyaml/pull/635

And applied to issue https://github.com/yaml/pyyaml/issues/89

Sep 09 '25 07:09 yevgeny-z