
Python-like yaml.add_constructor to process custom tags

Open ovanes opened this issue 10 years ago • 10 comments

In PyYAML there is a function which allows custom tag callbacks. Is this somehow supported with yaml-cpp?

What I mean is: there is a way to register a custom tag such as !include-file and have a callback function called when the node is processed, e.g.:

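import codecs
from os import path

import yaml


class YAMLProcessor(object):
    def include_file(self, loader, node):
        # node.value holds the scalar after the tag, i.e. the file path
        file_path = node.value
        if not path.isabs(file_path):
            # self.base_dir is assumed to be set elsewhere (e.g. in __init__)
            file_path = path.abspath(path.join(self.base_dir, file_path))
        with codecs.open(file_path, mode='r', encoding='utf-8') as input_file:
            return input_file.read()

    def __call__(self, *args, **kwargs):
        # Register the handler for the custom tag
        yaml.add_constructor(u'!include-file', self.include_file)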

When parsing the YAML input:

text-node: !include-file ./some-content.txt

as soon as the text-node value is parsed the function YAMLProcessor.include_file is called. How can I accomplish that in yaml-cpp?

ovanes avatar Jun 20 '15 21:06 ovanes

yaml-cpp works a little differently from py-yaml.

yaml-cpp's representation layer (turning a YAML::Node into a C++ type) mirrors C++'s static typing, so you must tell it what type to turn the node into. yaml-cpp provides Node::as<T> to convert the node to that type:

MyType t = node.as<MyType>();

There are many specializations that are already supported (int, float, string, vector, ...), and you can add your own by specializing the convert<T> template.

See Converting To/From Native Data Types for more details on this.

So if you know that the type you're going to get is a !include-file node, then you can request that from yaml-cpp. If not, then what you're saying is that you want to read dynamically-typed data, for which you must write your own representation layer that suits your needs (yaml-cpp only provides YAML::Node as a dynamically-typed object). pyyaml has the advantage of being written in a dynamically-typed language, so it can fall back to the language itself.

jbeder avatar Jun 21 '15 15:06 jbeder

Thanks for the answer. What you suggest is post-processing, but in pyyaml I can do event-based processing. In my example I don't want to get the !include-file type. I want that tag to be processed while parsing the YAML doc, expanding the relative path and returning something else (here just a string) from the handler/factory function.

ovanes avatar Jun 21 '15 16:06 ovanes

Hmm, I see what you mean.

pyyaml doesn't distinguish between parsing and representation, whereas yaml-cpp does. I assume that in pyyaml, the callback can return any python object? What would you imagine the analog in yaml-cpp would do? Return a scalar string? A string to be further parsed as YAML? A YAML::Node?

jbeder avatar Jun 22 '15 04:06 jbeder

Simply exposing the tag in YAML::Node should be enough for C++ code to handle it as it deems fit I think?

de-vri-es avatar Oct 11 '17 08:10 de-vri-es

So, it seems YAML::Node already exposes the tag with the Tag member function. That should be enough to implement custom tags from C++ I think?

de-vri-es avatar Oct 11 '17 09:10 de-vri-es

Sorry for the comment spam, but what I imagine is missing is something to generically handle something like a !include tag without having to modify the encode/decode functions of each type where you want to support the !include tag.

Not the prettiest solution, but without library support the easiest thing I can think of is a wrapper around node.as<T> which first checks the tag and then delegates to node.as<T>().
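
A minimal sketch of such a wrapper, under stated assumptions: the name as_with_include and the StubNode type are hypothetical and not part of yaml-cpp. The template only relies on Tag(), Scalar() and as<T>(), which YAML::Node provides, and it takes the file loader as a callable so that YAML::LoadFile could be passed in through a lambda. StubNode is there only so the sketch compiles without yaml-cpp installed.

```cpp
#include <sstream>
#include <string>

// Hypothetical stand-in exposing the three YAML::Node members used below.
struct StubNode {
    std::string tag;
    std::string scalar;

    const std::string& Tag() const { return tag; }
    const std::string& Scalar() const { return scalar; }

    // Simplified conversion: parses the scalar with operator>>.
    template <typename T>
    T as() const {
        std::istringstream in(scalar);
        T value{};
        in >> value;
        return value;
    }
};

// Checks the tag first, then delegates to node.as<T>().
// load_file is any callable that maps a path to a node.
// Assumption: include chains are finite; a cycle would recurse forever.
template <typename T, typename NodeT, typename LoadFile>
T as_with_include(const NodeT& node, LoadFile load_file) {
    if (node.Tag() == "!include") {
        // The scalar is treated as a path; the loaded node may itself
        // carry an !include tag, hence the recursive call.
        return as_with_include<T>(load_file(node.Scalar()), load_file);
    }
    return node.template as<T>();
}
```

With yaml-cpp, the call would presumably look like as_with_include<std::string>(node, [](const std::string& p) { return YAML::LoadFile(p); }). Resolving relative paths against the including file is left out of the sketch.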

In this case the tag can be considered as a pre/post-processing function taking a Node and turning it into another Node. That is something which could be generically supported by yaml-cpp. It could be implemented as an std::map<std::string, std::function<bool (YAML::Node &)>> (or similar). It could even be done as a separate step after parsing:

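// Proposed API: YAML::TagHandlers and YAML::ProcessTags are suggestions, not existing yaml-cpp functions.
YAML::TagHandlers tag_handlers; // typedef for a std::map from tag name to handler
// processYamlIncludeTag is a user-written handler of type bool (YAML::Node &)
tag_handlers.insert(std::make_pair("!include", processYamlIncludeTag));

YAML::Node node = YAML::LoadFile("file.yaml");
YAML::ProcessTags(node, tag_handlers); // rewrites tagged nodes in place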

Note that this can also be implemented completely outside of yaml-cpp currently.
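
As a rough illustration of how such a pass could be written outside the library, here is a self-contained sketch. Everything in it is an assumption for illustration: Node is a simplified stand-in for YAML::Node (a tag, a scalar and child nodes), and TagHandlers and ProcessTags only mirror the names proposed above. Each handler rewrites the node in place and returns false on failure.

```cpp
#include <functional>
#include <map>
#include <string>
#include <vector>

// Simplified stand-in for YAML::Node (hypothetical, illustration only).
struct Node {
    std::string tag;
    std::string scalar;
    std::vector<Node> children;
};

// A handler rewrites the node in place and reports success or failure.
using TagHandler = std::function<bool(Node&)>;
using TagHandlers = std::map<std::string, TagHandler>;

// Walks the tree depth-first. A node whose tag has a registered handler
// is rewritten first; the walk then descends into the children of the
// rewritten node, so tags inside included content are processed as well.
// Returns false as soon as any handler fails.
bool ProcessTags(Node& node, const TagHandlers& handlers) {
    auto it = handlers.find(node.tag);
    if (it != handlers.end() && !it->second(node)) {
        return false;
    }
    for (Node& child : node.children) {
        if (!ProcessTags(child, handlers)) {
            return false;
        }
    }
    return true;
}
```

An !include handler in this model would replace the node with the parsed content of the file named by its scalar. Because the walk descends after the rewrite, an include nested inside an included file is expanded in the same pass.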

de-vri-es avatar Oct 11 '17 09:10 de-vri-es

One problem I can see with YAML::ProcessTags is that the included YAML may itself have !include tags.

Is it possible to inherit from the YAML::Node class to have YAML::MyNode and overload the operator[] in it to achieve the above? This would return MyNode which would return a new LoadFile based node when it encounters a !include tag.

LoadFile too will have to be replaced by LoadFile2, to return MyNode instead of Node.

iyer-arvind avatar Feb 18 '18 09:02 iyer-arvind

> One problem of YAML::ProcessTags that I can perceive is that the included yaml may itself have !include tags.

Why would that be a problem? It can just call itself recursively. I implemented this at work and it works fine.

de-vri-es avatar Feb 18 '18 10:02 de-vri-es

OK, thanks! How does emitting work with this tag? Do you have a fork/link for this implementation?

iyer-arvind avatar Feb 20 '18 10:02 iyer-arvind

@de-vri-es A PR would be much appreciated for this, if it's at all possible.

youngmit avatar Sep 21 '22 23:09 youngmit