
how to include yaml files?

Open flixman opened this issue 2 years ago • 2 comments

I would like to parse the following YAML file, A.yml, which includes B.yml:

A.yml

---
all:
   children:
      GROUP1:
         <<: !include B.yml
      C: !include B.yml

B.yml

---
all:
   children:
      GROUP1:
         hosts:
            host1:

      GROUP2:
         hosts:
            host2:

To this end, I got it working by adding the following lines to the merge-key handling of the default constructor:

                    if isinstance(value_node, ScalarNode) and value_node.tag == '!include':
                        with open(value_node.value, 'rb') as fp:
                            merge.extend(yaml.compose(fp).value)
                    elif ...

To do so, I subclass SafeConstructor and override its default flatten_mapping method:

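import yaml
from yaml.composer import Composer
from yaml.constructor import ConstructorError
from yaml.nodes import MappingNode, ScalarNode, SequenceNode
from yaml.parser import Parser
from yaml.reader import Reader
from yaml.resolver import Resolver
from yaml.scanner import Scanner


class MySafeConstructor(yaml.constructor.SafeConstructor):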
    def flatten_mapping(self, node):
        merge = []
        index = 0
        while index < len(node.value):
            key_node, value_node = node.value[index]
            if key_node.tag == 'tag:yaml.org,2002:merge':
                del node.value[index]
                if isinstance(value_node, ScalarNode) and value_node.tag == '!include':
                    with open(value_node.value, 'rb') as fp:
                        merge.extend(yaml.compose(fp).value)
                elif isinstance(value_node, MappingNode):
                    self.flatten_mapping(value_node)
                    merge.extend(value_node.value)
                elif isinstance(value_node, SequenceNode):
                    submerge = []
                    for subnode in value_node.value:
                        if not isinstance(subnode, MappingNode):
                            raise ConstructorError("while constructing a mapping",
                                                   node.start_mark,
                                                   "expected a mapping for merging, but found %s"
                                                   % subnode.id, subnode.start_mark)
                        self.flatten_mapping(subnode)
                        submerge.append(subnode.value)
                    submerge.reverse()
                    for value in submerge:
                        merge.extend(value)
                else:
                    raise ConstructorError("while constructing a mapping", node.start_mark,
                                           "expected a mapping or list of mappings for merging, but found %s"
                                           % value_node.id, value_node.start_mark)
            elif key_node.tag == 'tag:yaml.org,2002:value':
                key_node.tag = 'tag:yaml.org,2002:str'
                index += 1
            else:
                index += 1
        if merge:
            node.merge = merge  # separate merge keys to be able to update without duplicate
            node.value = merge + node.value


class Loader(Reader, Scanner, Parser, Composer, MySafeConstructor, Resolver):

    def __init__(self, stream):
        Reader.__init__(self, stream)
        Scanner.__init__(self)
        Parser.__init__(self)
        Composer.__init__(self)
        MySafeConstructor.__init__(self)
        Resolver.__init__(self)

        yaml.add_constructor('!include', Loader.constructor_include, Loader)

    @staticmethod
    def constructor_include(loader: yaml.Loader, node: yaml.Node):
        with open(node.value, 'rb') as _f:
            return yaml.load(_f, Loader)


if __name__ == '__main__':
    with open('A.yml', 'r') as f:
        data = yaml.load(f, Loader)
    print(yaml.dump(data))

This produces the result I am looking for, but it seems quite cumbersome. Does anybody know of a more elegant way to do this? Thank you!

flixman, May 04 '22 13:05

Have a look at https://pypi.org/project/pyyaml-future/ ( https://github.com/yaml/pyyaml-future ).

It's a hack I made last year to do some things that might become (a configurable (not default)) part of pyyaml in the next release or two...

ingydotnet, May 04 '22 23:05

The problem is that PyYAML does not expect the value of the merge key to have a custom constructor. At the time flatten_mapping is called, the constructor has not yet been called, and it never will be. I think PyYAML should first construct the value and then check whether it is a dictionary or a list. Maybe this can be fixed in a general way by calling construct_object in flatten_mapping?
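Until something like that exists in PyYAML itself, a narrower workaround may be enough. The sketch below is not the construct_object approach described above. Instead, it swaps each `<<: !include ...` value for the composed root node of the included file and then hands over to the stock flatten_mapping, so the merge logic does not have to be copied. The names IncludeLoader and construct_include are made up for this illustration, paths are opened relative to the current working directory, and an empty include file is not handled.

```python
import os
import tempfile

import yaml
from yaml.nodes import ScalarNode

MERGE_TAG = 'tag:yaml.org,2002:merge'


class IncludeLoader(yaml.SafeLoader):
    """Hypothetical SafeLoader that accepts `!include` under `<<` as well."""

    def flatten_mapping(self, node):
        # Replace every `<<: !include file` value with the composed root node
        # of that file, then let the stock implementation do the merging.
        for index, (key_node, value_node) in enumerate(node.value):
            if (key_node.tag == MERGE_TAG
                    and isinstance(value_node, ScalarNode)
                    and value_node.tag == '!include'):
                with open(value_node.value, 'rb') as fp:
                    included = yaml.compose(fp, Loader=type(self))
                node.value[index] = (key_node, included)
        super().flatten_mapping(node)


def construct_include(loader, node):
    # Plain `key: !include file` case: load the file with the same loader class.
    with open(node.value, 'rb') as fp:
        return yaml.load(fp, Loader=type(loader))


IncludeLoader.add_constructor('!include', construct_include)

# Small demo with a temporary file standing in for B.yml.
with tempfile.TemporaryDirectory() as tmp:
    b_path = os.path.join(tmp, 'B.yml')
    with open(b_path, 'w') as fp:
        fp.write('hosts:\n  host1:\n')
    document = (
        'GROUP1:\n'
        '  <<: !include %s\n'
        '  extra: 1\n'
        'C: !include %s\n' % (b_path, b_path)
    )
    data = yaml.load(document, Loader=IncludeLoader)

print(data)
```

Because the included file is only composed, not constructed, nested merge keys inside it still go through the same override when the stock code recurses into the mapping.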

@flixman By the way, be careful when writing such an include constructor, especially when loading untrusted data. Avoid endless loops (including the same file over and over again) and forbid absolute paths and ../ by default (e.g. !include /etc/passwd).
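Those checks can be sketched as a small helper that an include constructor would call before opening anything. This is only an illustration under assumptions: resolve_include, base_dir and active are hypothetical names, not part of PyYAML, and the caller is assumed to track the chain of files currently being loaded.

```python
import os


def resolve_include(base_dir, requested, active=()):
    """Return the real path of an include, or raise ValueError if it is unsafe.

    base_dir  -- directory that includes are allowed to live in
    requested -- path exactly as written after `!include`
    active    -- real paths of the files currently being loaded
    """
    if os.path.isabs(requested):
        raise ValueError('absolute include paths are not allowed: %r' % requested)
    base = os.path.realpath(base_dir)
    target = os.path.realpath(os.path.join(base, requested))
    # Rejects `../` escapes as well as symlinks that point outside base_dir.
    if os.path.commonpath([base, target]) != base:
        raise ValueError('include escapes the base directory: %r' % requested)
    # A file that is already somewhere in the include chain would loop forever.
    if target in active:
        raise ValueError('circular include detected: %r' % requested)
    return target
```

A constructor would then open the returned path and pass `active + (target,)` down to whatever loads the nested file, so that a file including itself, directly or through other files, is caught.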

I wrote such a plugin for YAML::PP (Perl), and it handles all of that, plus it works with the merge key. We have that use case at work, where we use my module, and it's disappointing that this is not possible with PyYAML.

perlpunk, Jul 24 '22 12:07