Fix #866: Add constructor for tag:yaml.org,2002:value
Problem
PyYAML fails to parse YAML documents containing a standalone = character: loading stops with a constructor error for the tag tag:yaml.org,2002:value.
This affects real-world use cases like Prometheus CRDs with enum values containing =.
Root Cause
The YAML resolver correctly identifies standalone = as having the tag tag:yaml.org,2002:value, but no constructor was defined for this tag in the SafeConstructor class.
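The resolver behaviour described above can be checked without running any constructor. The following is a minimal sketch, assuming a stock PyYAML install; the helper name `resolved_tag` is made up for illustration and is not part of PyYAML.

```python
import yaml


def resolved_tag(text: str) -> str:
    """Return the tag assigned to the root node of a one-node YAML document.

    yaml.compose() stops after the composer/resolver stage, so no
    constructor is invoked and the lookup cannot fail on a missing one.
    """
    return yaml.compose(text, Loader=yaml.SafeLoader).tag


# A plain (unquoted) "=" matches the implicit resolver for the value tag.
print(resolved_tag("="))    # tag:yaml.org,2002:value
# Quoted scalars and longer plain scalars fall back to the string tag.
print(resolved_tag("'='"))  # tag:yaml.org,2002:str
print(resolved_tag("a=b"))  # tag:yaml.org,2002:str
```

Only the bare = gets the special tag, which is why quoting the value has been the usual workaround.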
Solution
- Added constructor mapping for tag:yaml.org,2002:value to construct_yaml_str
- Treats standalone = as string scalar (consistent with YAML 1.1 specification)
- Minimal, targeted fix with no breaking changes
- Single line addition to lib/yaml/constructor.py
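For readers who want to try the same mapping against an unpatched PyYAML, here is a rough sketch. It is not the PR's actual diff: it registers the constructor on a hypothetical subclass (`ValueAsStrLoader`, a name invented here) instead of editing lib/yaml/constructor.py, so the global SafeLoader is left untouched.

```python
import yaml
from yaml.constructor import SafeConstructor


class ValueAsStrLoader(yaml.SafeLoader):
    """Hypothetical SafeLoader subclass that treats a standalone '=' as a string."""


# add_constructor copies the constructor table onto the subclass,
# so this registration does not leak into yaml.SafeLoader itself.
ValueAsStrLoader.add_constructor(
    "tag:yaml.org,2002:value",
    SafeConstructor.construct_yaml_str,
)

doc = yaml.load("compare: =\nitems: [=, '=', a=b]\n", Loader=ValueAsStrLoader)
print(doc)
```

With this in place, the plain = and the quoted '=' both load as the one-character string "=".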
Testing
- ✅ Verified fix works with SafeLoader, FullLoader, UnsafeLoader
- ✅ Tested with real-world Prometheus CRD YAML from the issue
- ✅ Tested edge cases: flow sequences, mappings, mixed quoting
- ✅ Verified no regressions in existing functionality
- ✅ Test Script
import sys
import os
# Add the local lib directory to the path to use our modified version
sys.path.insert(0, os.path.join(os.getcwd(), 'lib'))
import requests
import yaml
url = "https://github.com/prometheus-operator/prometheus-operator/releases/download/v0.83.0/stripped-down-crds.yaml"
# Download the spec
response = requests.get(url)
response.raise_for_status()
res = response.text
# Parse multi-document YAML
documents = list(yaml.safe_load_all(res))
print("Parsed YAML documents:")
for i, doc in enumerate(documents, 1):
    print(f"\nDocument {i}:")
    print(doc)
Files Changed
- lib/yaml/constructor.py: Added constructor for tag:yaml.org,2002:value
Backwards Compatibility
✅ No breaking changes - only adds support for previously unsupported syntax
Fixes #866
Any news on this one?
Thank you for providing a fix!
Is there a time frame when we can expect this to be available?
I'm confused why anyone is pushing back on this and saying that = should fail on parsing. = is not a special char in yaml. It definitely gives me that fuzzy special feeling about it, and it seems like maybe it might be special some day, but it's not as of yaml 1.2. I would love to see this fix approved and merged in, because the npm package yaml is the fastest growing yaml support package for JavaScript projects, and it would default to serialize { compare: '=' } as compare: = , not compare: '='. I also can't imagine how this fix could possibly be a breaking change.
> I'm confused why anyone is pushing back on this and saying that = should fail on parsing. = is not a special char in yaml. It definitely gives me that fuzzy special feeling about it, and it seems like maybe it might be special some day, but it's not as of yaml 1.2.
PyYAML implements YAML 1.1, and there it is special: https://yaml.org/type/value.html
The spec only talks about it when it's a mapping key. It doesn't specifically say what the library should do when it's not a mapping key, but returning the simple string seems the best thing to do, that's right.
> I would love to see this fix approved and merged in, because the npm package yaml is the fastest growing yaml support package for JavaScript projects, and it would default to serialize { compare: '=' } as compare: = , not compare: '='.
You mention the node yaml package. If you are using YAML 1.2, it would make sense to try https://pypi.org/project/yamlcore/, which does not treat = specially.
(I know node yaml supports YAML 1.1 as well, but often people are not even aware which version they are using, so I thought I'd rather mention this possibility.)
Similar fix in https://github.com/yaml/pyyaml/pull/635
And applied to issue https://github.com/yaml/pyyaml/issues/89