rattler-build fix: special case jinja

There is a confusing behavior about types with rattler-build and minijinja.

When we have something like:

context:
  variable: ${{ "1234" }}

Then in the currently released version of rattler-build, 1234 will become an integer because the Jinja evaluates to 1234 which will be interpreted by YAML as a integer.

However, we can fix this by special casing Jinja expressions (starting with ${{ and ending with }}). We can fix this by evaluating the Jinja as an expression and look at the Jinja type that comes out.

If it is a string, we can then quote the string and forbid YAML to coerce this type, so that a string stays a string.

This will also fix this situation:

context:
  env_var: ${{ env.get("INTEGER") }}  # <- now this will always be a string, if not explicitly casted

Where we may run into trouble is with something like:

context:
  joined: ${{ "1" }}234

This would be evaluated as integer 1234 with the current code.

However, we could also decide to allow booleans or numbers to only come from simple Jinja expressions and not proper "templates" (ie. always cast anything more complex than a single expression to string).

May 02 '25 17:05 wolfv

With this we could also revisit https://github.com/prefix-dev/rattler-build/issues/971 and https://github.com/prefix-dev/rattler-build/issues/1423

May 03 '25 07:05 wolfv

We can change this further to allow Jinja expressions to return list and map objects to YAML, and then ingest them as lists or YAMLs from Jinja.

We could then also make the following use case work, where a user expands a list into YAML:

context:
  mylist: [a,b,c]

requirements:
  run: 
    - ${{ mylist }}  # should expand to a SequenceNode and not a ScalarNode

May 03 '25 15:05 wolfv

Using this recipe:

context:
  cuda_version: ${{ env.get("RAPIDS_CUDA_VERSION") }}
  cuda_major: ${{ (env.get("RAPIDS_CUDA_VERSION") | split("."))[0] }}

package:
  name: test-package
  version: 1.0

build:
  script:
    - if: cuda_major == "12"
      then:
        - echo "cuda_major is a STRING and is 12"
      else:
        - echo "cuda_major is NOT a STRING or is not 12 (value was ${{ cuda_major }})"

we still cannot get the 12 as string, i am still working on it to check if we can make a solution

$env:RAPIDS_CUDA_VERSION="12.8.0"; cargo run -- build --recipe recipe.yaml

on powershell

May 07 '25 10:05 zelosleone

Indeed, it seems to go wrong somewhere in the code.

context:
  cuda_version: ${{ env.get("RAPIDS_CUDA_VERSION") }}
  cuda_major: ${{ (env.get("RAPIDS_CUDA_VERSION") | split("."))[0] }}
  cuda_major_is_string: ${{ (env.get("RAPIDS_CUDA_VERSION") | split("."))[0] is string }}

Running cargo r -- build --recipe ./examples/ajinja --render-only

Gives us:

[
  {
    "recipe": {
      "schema_version": 1,
      "context": {
        "cuda_version": "12.8.0",
        "cuda_major": 12,
        "cuda_major_is_string": true
      },
      "package": {
        "name": "test-package",
        ...

May 07 '25 12:05 wolfv

@wolfv There is a huge issue that is not producable in WSL and windows, but it is in all github runners, which is mainly about the unit tests (pkg_hash is the most serious one) I tried to solve it, change the formatters for hash, change the calculation logic and none of them passes. It also doesn't recognize

${{ PYTHON }} --help

now, which is another test fails now. Tried to fix that as well but the only solution i could find for 5 hours :D was making it

- python

May 07 '25 18:05 zelosleone

@beckermr the problem that we are fixing is the following:

foobar: ${{ "1234" }} <- would evaluate to an integer

This is quite surprising I find. Having to wrap Jinja expressions in quotes to force a string is more cumbersome IMO. The code in this PR works quite well and is less surprising when dealing with env vars and similar things.

May 07 '25 21:05 wolfv

Yes. Jinja always does string substitutions. The types of items do not cross the boundary of the ${{ }} expression.

The issue with having the types cross is that for people who use jinja properly as a string templating, it becomes ambiguous and confusing.

This case is now really hard to predict:

var: "${{ "1234" }}"

Does the types cross the boundary of the jinja quote the internal quotes this pr implicitly adds? Does it ignore them? Does it turn them to single quotes? Does it render a nonsense yaml value? Why is one of these obviously the answer when the others are not?

This PR actually makes the interaction of the jinja templating with the yaml LESS clear.

The rule for jinja2 is simple:

"jinja produces strings that are inserted into the yaml."

Anything else is more complicated and surprising as you work through the edge cases.

May 07 '25 21:05 beckermr

For this case:

var: "${{ "1234" }}"

The (outer) quotes will always win and force whatever is inside to be a string.

The more ambigous case (but not really, due to outer quotes) would be:

var: "${{ 1234 }}"

where the inner type is an integer but the quotes cast it to a string.

May 07 '25 21:05 wolfv

Yep. Those examples exactly illustrate my point.

I have written out explicitly how the jinja+yaml works for four cases below.

a

var: ${{ 1234 }}

renders using the following steps

jinja parses+evals the text inside ${{ }} (it results in an int 1234)
jinja renders the final value to a string "1234"
jinja inserts the value of the string into the YAML This yields
```
var: 1234
```
YAML parses var: 1234 to a dict with a key var and a value 1234 as an int

b

var: "${{ "1234" }}"

renders using these steps

jinja parses+evals the text inside ${{ }} (it results in a string "1234")
jinja renders the final value to a string (since step 1 yielded a string already, this step is a no-op)
jinja inserts the value of the string into the YAML This yields
```
var: "1234"
```
YAML parses var: "1234" to a dict with a key var and a value "1234" as a string

c

var: "${{ 1234 }}"

renders using these steps

jinja parses+evals the text inside ${{ }} (it results in an int 1234)
jinja renders the final value to a string "1234"
jinja inserts the value of the string into the YAML This yields
```
var: "1234"
```
YAML parses var: "1234" to a dict with a key var and a value "1234" as a string

d

var: ${{ "1234" }}

renders using the following steps

jinja parses+evals the text inside ${{ }} (it results in a string "1234")
jinja renders the final value to a string (since step 1 yielded a string already, this step is a no-op)
jinja inserts the value of the string into the YAML This yields
```
var: 1234
```
YAML parses var: 1234 to a dict with a key var and a value 1234 as an int

This PR would change the behavior of case d to insert a quoted string instead of the value of the string (so step 3 changes) based on the type of the underlying jinja2 expression (computed in case 1).

However, if that rule is applied consistently, then case b should yeild var: ""1234"" with the extra set of quotes since in case b the type of the jinja2 expression ends up as a string in step 1.

The only way to tell b should be different than d in an automated way is to figure out what type YAML would parse and then account for that. That extra logic where the jinja2 parser has to reach out beyond itself to figure out why type to return breaks the clean separation between the jinja2 parsing and the yaml parsing.

The downstream consequence is that we've deviated from the standard behavior for these tools meaning that

Folks who pick up off-the-shelf jinja2 and yaml parsers will get different results compared to rattler-build.
Existing code bases that parse v1 recipes (like the bot, conda-recipe-manager, conda-smithy etc.) would need complicated special-casing / behavior to implement this rule.
People who understand how templating engines work will be surprised by the results.

This entire thing reminds me of selectors in conda build v0 recipes. Instead of using the tools as intended, extra syntax and meaning was introduced by using python eval in comments in yaml. This choice caused non-trivial issues, work, and pain for the entire ecosystem. Let's please not make this mistake again.

May 07 '25 22:05 beckermr

I think I can live with your point @beckermr.

However, one thing that doesn't work right now is:

context:
   specs: [a,b,c]

requirements:
  run:
    - ${{ specs }}

And I think we should make that work (this will be currently interpreted as - "[a, b, c]"

The other thing that will be much harder to make work is:

context:
   toml_contents: ${{ load_from_file("pyproject.toml") }}

We could potentially serialize the file as JSON on a single line and insert it that way into the YAML to get something that we could later index into. But not sure if that is a great idea.

One workaround would be something like:

context:
  _loader: |
    {% set toml = load_from_file("pyproject.toml") %}
  version: ${{ toml.version }}

But we don't advertise this capability widely :)

May 16 '25 12:05 wolfv

Unfortunately these special cases are also outside standard jinja2 and yaml behavior. Thus I don't think you can change their behavior without the same consequences as above.

May 16 '25 13:05 beckermr