pyyaml
Load comments
How do I load (and dump) comments? In the following code, how can I print out the first comment?
import yaml
code = '''# The following key opens a door
key: value
# this is another comment'''
loaded = yaml.safe_load(code)  # yaml.load() without a Loader argument fails on PyYAML 6+
# CODE HERE: How to print out 'The following key opens a door'?
https://pypi.python.org/pypi/ruamel.yaml is an alternate implementation that specifically covers this use-case
Any chance for this to ever become a reality? I am using pyyaml with ruamel.yaml in tandem for linting purposes, where there is a real need to also read comments. Still, the current development of ruamel makes it a dependency liability.
Probably not unless someone's volunteering to do all the legwork for it. Comments are completely discarded down at the scanner level by both libyaml and pyyaml, so there's a lot of plumbing required across multiple projects to even make comments a "thing", and you'd still probably have to do custom plumbing at parser level to make a "side lookup" or something that you could consult, or serialize them into your output dict using custom logic. I just can't think of a standard way to do it without those intermediate representations.
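To make the "side lookup" idea concrete, here is a rough sketch that works outside the parser rather than inside it. It assumes a single document whose root is a mapping with scalar keys, and it treats any `#` lines directly above a key as that key's comment. The function name `comment_side_lookup` is hypothetical and not part of PyYAML; the only PyYAML pieces used are `yaml.compose` and the `start_mark.line` attribute on nodes.

```python
import yaml


def comment_side_lookup(text):
    """Map each top-level scalar key to the comment lines directly above it."""
    lines = text.splitlines()
    root = yaml.compose(text, Loader=yaml.SafeLoader)
    lookup = {}
    if not isinstance(root, yaml.MappingNode):
        return lookup
    for key_node, _value_node in root.value:
        if not isinstance(key_node, yaml.ScalarNode):
            continue  # complex keys are out of scope for this sketch
        comments = []
        # start_mark.line is 0-based; walk upwards over contiguous '#' lines.
        row = key_node.start_mark.line - 1
        while row >= 0 and lines[row].lstrip().startswith("#"):
            comments.insert(0, lines[row].lstrip()[1:].strip())
            row -= 1
        lookup[key_node.value] = comments
    return lookup


code = '''# The following key opens a door
key: value
# this is another comment'''

print(comment_side_lookup(code))
```

Note that the trailing comment is not attached to anything here, and the lookup is read-only: it says nothing about how to write the comments back out after the data changes.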
How is it going, is it already on the road map? I would really like this feature and I don't want to migrate to another library just because of it :c
My previous statement still applies: pyyaml doesn't have any notion of format preservation, since it deserializes to basic Python types by default. Comments are only recognized at the very lowest level of the parser, just enough to discard them from the document stream. IIUC, ruamel was basically rewritten from the ground up to accommodate format preservation and document round-tripping (and it's taken them a few tries to get it right). That seems unlikely for this project, where backwards compatibility and stability are the primary community expectations.
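A quick way to see this for yourself is to dump the token stream. This is only a sketch using the public `yaml.scan` helper with the pure-Python `SafeLoader`; the variable names are illustrative.

```python
import yaml

code = '''# The following key opens a door
key: value
# this is another comment'''

# Ask the scanner for the raw token stream of the document.
tokens = list(yaml.scan(code, Loader=yaml.SafeLoader))

token_names = [type(tok).__name__ for tok in tokens]
scalars = [tok.value for tok in tokens if isinstance(tok, yaml.ScalarToken)]

print(token_names)  # structural and scalar tokens only
print(scalars)      # ['key', 'value']
```

Neither comment shows up in any token, so nothing downstream of the scanner ever gets a chance to see them.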
Even if someone volunteered to do the work- where should the comments get stashed? Comments could theoretically be stored as dynamic instance attributes on the deserialized objects, but most stock Python types disallow addition of instance attributes, so we'd have to redefine or proxy all the base types to support comments, which has backwards-incompatible serialization and type implications for, well, pretty much everything.
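To illustrate the problem with plain Python, here is a small standard-library sketch. `CommentedStr` is a hypothetical proxy type made up for this example, not something PyYAML provides.

```python
value = "value"

# Stock types such as str reject new instance attributes.
error = None
try:
    value.comment = "The following key opens a door"
except AttributeError as exc:
    error = exc
print(type(error).__name__)  # AttributeError


class CommentedStr(str):
    """Hypothetical str proxy that can carry a comment attribute."""


proxied = CommentedStr("value")
proxied.comment = "The following key opens a door"

print(proxied == "value")    # True: it still compares equal to a plain str
print(type(proxied) is str)  # False: exact type checks now behave differently

# Ordinary string operations return plain str, silently dropping the comment.
upper = proxied.upper()
print(hasattr(upper, "comment"))  # False
```

Any code that checks exact types, or that transforms the values before dumping them again, would hit one of these edges.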
The parser changes are the easy part, but I can't think of a straightforward way to expose comments on the output object graph or to make them round-trippable (especially in the face of other structural changes to the intermediate object graph) without basically starting over. If anyone else can, I'm all ears...
not possible yet?
I'd like to express support for this too!
From section 3.2.3.3 of the YAML v1.2 spec: "Comments are a presentation detail and must not have any effect on the serialization tree or representation graph. In particular, comments are not associated with a particular node."
I'm not 100% sure about how to interpret the "must not have any effect on the serialization tree or representation graph" part.
There are scenarios where we need to update only specific values and change as little else as possible. Ideally it would work like an edit made with sed or vim. I hope this feature can be implemented.
YAML is often used as a compromise between readability for users and for code. Comments, ordered dicts, and other such things are an important part of what makes YAML comfortable. In many cases, we use PyYAML to edit existing YAML files. We really need to be able to preserve comments.
No updates about it?
@AdityaSoni19031997 While I would love to see this in pyyaml, so that I would not have to use two loaders (pyyaml and ruamel.yaml) in ansible-lint, I doubt we will ever see this done in pyyaml. Two major things need to happen for this to become tangible: the concept has to be approved by the pyyaml maintainers, and someone really dedicated has to implement it. If either is missing, we will not get it in pyyaml.
Keep in mind that if this is to be implemented, it would likely have to be done in two different places, the pure-Python variant and also the compiled one, making the goal much harder. To make it more achievable, I would say that we might add a special loader that uses pure Python, and those who want it could use it. The reality is that the number of consumers that need to parse comments and other information that is not part of the data model is limited, and making pyyaml slower and harder to maintain would be a problem.
We need this feature too 🥺 🥺 🥺
We also need this to preserve comments in yaml for helm charts. Any chances that this feature is placed in roadmap anytime soon?
@nitzmahone has stated multiple times (1, 2) that this is a very large ask, and he's open to the idea if someone is willing to contribute an agreeable design and implementation.
Users who complain about missing features that they care about, commenting things like "what a shame" without offering even the slightest amount of assistance are slowly ruining open-source software.
Let me offer a flow chart:
graph TD
want[I want support for this feature.]
willing{Am I willing to materially<br/>contribute to the project?}
alternative{Am I able to use an<br/>alternative package?}
ok[Please comment with your contributions!]
goalt[Please use an alternative package<br/>which supports your requirements.]
stop[Please do not comment.]
want-->willing
willing--Yes-->ok
willing--No-->alternative
alternative--No-->stop
alternative--Yes-->goalt
Heh, thanks @JonathonReinhart...
I totally get the need- I've had it myself numerous times, and I usually end up reaching for ruamel (or other hacks I'm ashamed of and don't want to discuss). I have actually thrown together a couple of "toy" implementations of comment and formatting metadata preservation for PyYAML over the past few years just to validate my assumptions about the various issues (spoiler: yep, it's even harder than I thought!), but it's pretty difficult to justify any real effort when the current YAML specs explicitly forbid it. I occasionally hear rumblings that YAML 1.3 might soften that stance, but since we're still on an asymptotic approach to completion for our 1.2 support, I wouldn't hold my breath even if it does. :wink:
I work on PyYAML because it's important to my primary project and my employer, who is gracious enough to allow me to spend some of my time giving back to the Python community. For how (relatively) simple Ansible's YAML needs are, and how little of the time I do spend on it directly benefits Ansible, we'd probably be a lot better off just vendoring a cut-down fork of the project, but that's not the way we want to be.
And yeah, I do have to just laugh to myself a bit when folks "threaten" to take their "business" to another Python YAML library.