type of context vars inconsistent based on line continuation
I'm working on a recipe with a derivative context variable that should be an integer. I'm discovering that if I split the expression using a multi-line continuation like >-, which produces the exact same parsed yaml, the resulting variable changes type from int to string.
Example:
context:
build: 1
build_plus: ${{ build + 1 }}
build_plus_2: >-
${{ build + 1 }}
In the data structure, build_plus and build_plus_2 ought to be identical:
from ruamel.yaml import YAML
with open("recipe/recipe.yaml") as f:
recipe = YAML().load(f)
recipe['context']['build_plus'] == recipe['context']['build_plus_2']
but in the computed context build_plus is an integer, while build_plus_2:
rattler-build build --render-only -r recipe | jq '.[0].recipe.context'`
{
"build": 1,
"build_plus": 2,
"build_plus_2": "2",
}
I didn't expect splitting an expression across lines for readability without changing the yaml output to change the type of variables in the recipe.
This is the expected outcome AFAIK.
In YAML, the line continuation stuff (| or >) is for strings only AFAIK.
So the sequence of steps is
build_plus_2: >-
${{ build + 1 }}
gets the jinja2 applied to produce
build_plus_2: >-
2
which YAML then properly interprets as a string split across lines.
There is no way to "split an integer across lines" YAML.
Stated more simply, as soon as you use > or | you are explicitly casting the thing as a string in the same way that using single or double quotes would.
Yeah, @beckermr is right. We take the type from the "YAML" type, not the Jinja type. I think that's to some extent confusing, but also extremely hard to fix...
I think I don't quite understand how/when substitutions happen. Putting it in quotes also makes in unambiguously a string (I should have checked this first), which makes the continuation effect make sense, too. I had thought jinja interpretation happened entirely after parsing yaml in v1, where all templates are obviously always strings, so the yaml serialization of identical strings wouldn't affect the outcome of the template, but that's clearly not the case. I think I know what to expect now, though I don't know how or why, it's just not what I expected.
Let me re-open this as I think there is a fair point. We could indeed forego the YAML parsing and detect whether a given string is only a Jinja expression (starts with ${{ and ends with }}). And if that is the case, then we could try to figure out the type that results from evaluating the expression vs. re-parsing the YAML.
@wolfv Good idea, I suspect that would match what I was expecting to happen.
I think there might be some docs to add mentioning how the final result is processed into the recipe. It's not clear when things are coerced to/from strings (e.g. ${{ '1' }} is an integer, not the string result of the expression). In their absence, I happened to guess wrong about how an expression is parsed into a context variable.
Some examples of non-string results and how they are handled would be great. For example, with the recipe:
context:
false_literal: false
false_literal_str: 'false'
false_jinja: ${{ false }}
false_jinja_str: ${{ 'false' }}
it's not immediately clear that both false_jinja and false_jinja_str context vars evaluate to the bool false, despite false_jinja_str being explicitly a string-typed expression. I think I get that string expressions are parsed as yaml, thus stripping any quotes, but that's not documented as far as I can tell. It's certainly not the case that you can safely assume that the type of the variable matches the type of the expression.
Indeed, the string should not coerce to a boolean value - that's definitely a bug.
What do you both think about the solution discussed in #1612 ?
We would treat Jinja expressions differently from strings with Jinja, and try to make things more "type-safe".
I think preserving the expression type into the yaml would make it a lot clearer and easier to reason about.
Unfortunately rattler-build-conda-compat does set properly types when rendering context sections via https://github.com/prefix-dev/rattler-build-conda-compat/blob/df3646efc79435a5f07e73c67be248f97c41d888/src/rattler_build_conda_compat/jinja/jinja.py#L91. The issue there is that the rendered text is not then put through the yaml type inference.
In other words, this function:
def load_recipe_context(context: dict[str, str], jinja_env: jinja2.Environment) -> dict[str, str]:
"""
Load all string values from the context dictionary as Jinja2 templates.
Use linux-64 as default target_platform, build_platform, and mpi.
"""
# Process each key-value pair in the dictionary
for key, value in context.items():
# If the value is a string, render it as a template
if isinstance(value, str):
template = jinja_env.from_string(value)
rendered_value = template.render(context)
context[key] = rendered_value
return context
should actually do something like this
def load_recipe_context(context: dict[str, str], jinja_env: jinja2.Environment) -> dict[str, str]:
"""
Load all string values from the context dictionary as Jinja2 templates.
Use linux-64 as default target_platform, build_platform, and mpi.
"""
# Process each key-value pair in the dictionary
for key, value in context.items():
# If the value is a string, render it as a template
if isinstance(value, str):
template = jinja_env.from_string(value)
rendered_value = template.render(context)
yaml_str = "value: " + rendered_value
yaml_dict = yaml.load(yaml_str)
context[key] = yaml_dict["value"]
return context
I am still of the opinion that treating jinja2 as anything other than producing strings is going to cause pain. If we deviate from the standards, this causes pain since people cannot simply use any old off-the-shelf jinja2 or yaml parser. We already have enough pain with various bool-as-string to bool conversions across various tools. I do not know why we would want to leave more of those footguns lying around.