pyyaml-include
Relative include: relative to the current file?
IMHO it would make more sense if a relative include searched for the file in the current file's directory rather than in the current working directory.
Use case: when I pass config files on the command line, I might have them in different directories. And if they're independent, they could load their own local extensions.
To avoid breaking backward compatibility, the solution could be, for example, to add a new parameter to the constructor, or to initialize base_dir=None instead of the default empty string.
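To make the proposal concrete, here is a minimal sketch of how a base_dir=None default could coexist with the old behaviour. `resolve_include` and its `current_file` argument are hypothetical names used for illustration, not part of pyyaml-include, and the sketch assumes the library can somehow learn the path of the file being parsed.

```python
from pathlib import Path
from typing import Optional, Union

PathLike = Union[str, Path]


def resolve_include(
    pathname: PathLike,
    base_dir: Optional[PathLike] = None,
    current_file: Optional[PathLike] = None,
) -> Path:
    """Hypothetical helper: decide where a relative include points."""
    target = Path(pathname)
    if target.is_absolute():
        return target  # absolute paths are never rebased
    if base_dir is not None:
        # An explicit base_dir (including the legacy "") keeps the old
        # behaviour; "" means "relative to the current working directory".
        return Path(base_dir) / target
    if current_file is not None:
        # Opt-in default: relative to the including file's directory.
        return Path(current_file).parent / target
    return target  # nothing known: fall back to the working directory
```

Callers that already pass a base_dir, even the empty string, would see no change; only code that leaves it as None would get file-relative resolution.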
EDIT: perhaps it is not possible to get information about the current file from the node object passed to the constructor?
But as long as we can redefine the reader, I see that the yaml.Reader class can handle stream names for file-like objects.
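For what it's worth, PyYAML's reader records the stream name on the loader: a constructor receives the loader as its first argument, and `loader.name` holds the file object's `name` attribute (or a placeholder such as `<unicode string>` for plain strings). Below is a rough sketch of a relative include built only on that, using plain PyYAML rather than pyyaml-include. The tag `!rel_include` and the function `relative_include` are made-up names, and the sketch assumes the stream was opened from a real path.

```python
import os

import yaml


def relative_include(loader, node):
    """Load the file named by the node, relative to the including file."""
    target = loader.construct_scalar(node)
    name = loader.name  # the stream's `name`, or a placeholder for strings
    is_real_file = isinstance(name, (str, os.PathLike)) and os.path.isfile(name)
    # Without a real file behind the stream, fall back to the cwd.
    base = os.path.dirname(name) if is_real_file else ""
    with open(os.path.join(base, target), encoding="utf-8") as included:
        # A fresh loader of the same class gets the included file's own name,
        # so nested includes resolve relative to that file in turn.
        return yaml.load(included, Loader=type(loader))


# Registered on SafeLoader only; other loader classes would need their own call.
yaml.add_constructor("!rel_include", relative_include, Loader=yaml.SafeLoader)
```

With this in place, `yaml.safe_load(open("conf/main.yml"))` would resolve `!rel_include sub/b.yml` against `conf/`, regardless of the working directory.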
@ba1dr Thanks for your advice!
Yes, it's nice and natural to resolve a relative path from where the current YAML file is.
But I cannot get information about the current file, as you wrote.
For the case of "config files in different directories", I think the current API can't handle that elegantly, but using absolute paths is workable.
jinjyaml is a Jinja2 template engine integration for PyYAML.
We can include files with Jinja2's include statement:
Suppose we have the YAML below:
parent: !j2 |
  {% include "child-1.yml" %}
  {% include "child-2.yml" %}
then execute:
import jinja2
import jinjyaml
import yaml

# your_base_dir: the directory containing child-1.yml and child-2.yml
j2_env = jinja2.Environment(
    loader=jinja2.FileSystemLoader(searchpath=your_base_dir)
)
j2_ctor = jinjyaml.Constructor()
yaml.add_constructor('!j2', j2_ctor)
# yaml_string: the YAML document shown above
doc = yaml.full_load(yaml_string)
data = jinjyaml.extract(doc, env=j2_env)
Jinja2's FileSystemLoader would load child-1.yml and child-2.yml relative to its search path.
And we can even write a custom Jinja2 file loader for a particular purpose.
Hmm, no, I think Jinja2 would be overkill. If I were using a template engine, I'd rather generate a Python config file with Jinja2 than YAML. Even with this include feature, I am not sure it is a good idea to use it, as it breaks compatibility with other languages or scripts that do not support this tag.
Perhaps a JSON Pointer equivalent for YAML (if there is one) would be a better fit for this case.
In case anyone's interested, my current workaround is this:
import contextlib
import os
import pathlib

import yaml
from yamlinclude import YamlIncludeConstructor

YamlIncludeConstructor.add_to_loader_class(loader_class=yaml.SafeLoader)


@contextlib.contextmanager
def working_directory(path: pathlib.Path):
    prev_cwd = pathlib.Path.cwd()
    os.chdir(path)
    try:
        yield
    finally:
        # Always restore the previous working directory
        os.chdir(prev_cwd)


def load_config_file(file_path: pathlib.Path):
    # Resolve first, so a relative file_path still opens after the chdir
    file_path = file_path.resolve()
    with working_directory(file_path.parent):
        with file_path.open("r") as config_file:
            return yaml.safe_load(config_file)
Obviously, the limitation is that any include at the second level or deeper is resolved relative to the first file, not to the intermediate file that contains it, but luckily that's good enough for us right now :)
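To spell out that limitation, here is a small path-only sketch comparing the two resolution rules. The file names and the two helper functions are hypothetical; nothing here touches YAML or the file system.

```python
from pathlib import PurePosixPath


def resolve_from_cwd(cwd: PurePosixPath, include: str) -> PurePosixPath:
    """What the chdir workaround does: every include is joined to one cwd."""
    return cwd / include


def resolve_from_file(current_file: PurePosixPath, include: str) -> PurePosixPath:
    """What a file-relative include would do instead."""
    return current_file.parent / include


top = PurePosixPath("/etc/app/main.yml")  # hypothetical top-level config
cwd = top.parent                          # set once by working_directory()

# main.yml includes "sub/b.yml": both rules agree at the first level.
b_file = resolve_from_cwd(cwd, "sub/b.yml")   # /etc/app/sub/b.yml

# b.yml includes "c.yml": the two rules now disagree.
print(resolve_from_cwd(cwd, "c.yml"))      # /etc/app/c.yml
print(resolve_from_file(b_file, "c.yml"))  # /etc/app/sub/c.yml
```

As long as nested files spell their includes relative to the top-level directory, the workaround behaves the same as a true file-relative include.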
> But I cannot get information about the current file, as you wrote.
Actually @tanbro, you can! :)
One just has to update base_dir as loading moves from file to file, extract the file name from the stream, and patch yaml.load accordingly.
EDIT: Updated the snippet, this is what we now use internally in an __init__.py.
import yaml
from yamlinclude import YamlIncludeConstructor

YamlIncludeConstructor.add_to_loader_class(loader_class=yaml.FullLoader)
YamlIncludeConstructor.add_to_loader_class(loader_class=yaml.SafeLoader)
YamlIncludeConstructor.add_to_loader_class(loader_class=yaml.Loader)
YamlIncludeConstructor.add_to_loader_class(loader_class=yaml.BaseLoader)

include_tag = YamlIncludeConstructor.DEFAULT_TAG_NAME
yaml_load = yaml.load  # Save original load function


def load_yaml(stream, Loader):
    from pathlib import Path

    if include_tag not in Loader.yaml_constructors:
        return yaml_load(stream, Loader=Loader)
    path = Path(stream.name)
    constructor = Loader.yaml_constructors[include_tag]
    previous_base = constructor.base_dir
    # Point includes at the directory of the file being loaded
    constructor.base_dir = path.parent.as_posix()
    try:
        return yaml_load(stream, Loader=Loader)
    finally:
        # Restore base_dir even if parsing raises
        constructor.base_dir = previous_base


yaml.load = load_yaml  # Use new one
del YamlIncludeConstructor
del yaml
The above fails on strings, which have no name attribute (e.g. when called as yaml.load(f.read()), or with YAML defined inline).
One can add an isinstance(stream, io.TextIOWrapper) check as needed.
EDIT: Like so:
yaml_load = yaml.load  # Save original load function


def load_yaml(stream, Loader):
    from pathlib import Path
    from yamlinclude import YamlIncludeConstructor
    from io import TextIOWrapper

    tag = YamlIncludeConstructor.DEFAULT_TAG_NAME
    if tag not in Loader.yaml_constructors or not isinstance(stream, TextIOWrapper):
        # If tag is included in the stream but we can't get the file location, we can't assume
        # anything about the relative file location
        return yaml_load(stream, Loader=Loader)
    path = Path(stream.name)
    constructor = Loader.yaml_constructors[tag]
    previous_base = constructor.base_dir
    constructor.base_dir = path.parent.as_posix()
    try:
        return yaml_load(stream, Loader=Loader)
    finally:
        # Restore base_dir even if parsing raises
        constructor.base_dir = previous_base


yaml.load = load_yaml  # Use new one
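One caveat with the isinstance check: it rejects files opened in binary mode, which yaml.load also accepts, and it lets through sys.stdin, whose name is the placeholder "<stdin>". A possible alternative, sketched here with a hypothetical helper `stream_directory`, is to look at the stream's name and confirm it points at a real file:

```python
import os
from pathlib import Path
from typing import Optional


def stream_directory(stream) -> Optional[Path]:
    """Return the directory of the file behind `stream`, or None if unknown."""
    name = getattr(stream, "name", None)
    # Strings, bytes and StringIO have no name; fd-based files have an int name.
    if not isinstance(name, (str, os.PathLike)):
        return None
    path = Path(name)
    # Placeholders such as "<stdin>" are not real files on disk.
    return path.resolve().parent if path.is_file() else None
```

The patched loader could then fall back to the original yaml.load whenever this returns None, and use the returned directory as base_dir otherwise.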
@ba1dr @1ace You might be interested ^
#26 provides a way to include files relatively