yaml-cpp
yaml-cpp copied to clipboard
node: id hash_value UnshareSubtrees Deep+ShallowClone ModifyValues
At the cost of exposing some existing detail, enable efficient unsharing of graph structure (DeepClone), and efficient ShallowClone.
Modify[Key]Values allows in-place modification (or deletion) of node_data in a Map. (the Key is not modified but provided so the modification of the value may depend on the key)
UnshareSubtrees for in-place DeepClone copying of shared subtrees.
id and hash_value allow creating efficient input Node -> output Node transformations where the result of a previously transformed subtree may be reused (allowing linear time creation of output trees with shared subtrees instead of exponential by
Rationale: although YAML::Node faithfully represents the input document and allows efficient traversal, using it as a large-scale configuration / AST means that it's easy to unintentionally create quadratic or worse algorithms modifying the tree.
Several of these changes can be made separate pull requests if desired. Tests passed locally and we have our fork is in production successfully. If there's only some subset that you'd like to maintain, we're prepared to keep the fork going, but ideally this work may be useful to others. Also happy to add docs.
This is very interesting, thanks for the PR.
A couple high level-comments that we should work out before going into detail.
-
It's worth being explicit about the problem. Could you give an example (of the bad behavior) in a few lines of code? Or maybe create a github gist with several examples?
-
I'd prefer a factory pattern rather than constructors-with-tags. E.g.
Node ShallowClone(Node)kinda thing. -
The API exposes too many internal details. The
Nodeclass in particular; I'd rather not add any unnecessary public methods. Can you rework it so that's the case?