rattler-build Idea for better (non-redundant, more compact) branching syntax for selectors?

While thinking about #1154, it occurred to me that it would be even nicer if we could do a match-style syntax that doesn't have to choose a single branch but could apply several!

        files:
          - match: linux
              - something_linux
            match: osx
              - something_osx
            match: unix         # e.g. on linux, this should be taken
              - something_unix  # _in addition_ to the linux-only case
            match: win
              - something_win
            match: unix or win
              - something_that_applies_to_everything
            # no more `else:`, only explicit matches!

I guess it might be a bit late for such deep syntactic changes, but IMO this would make it much nicer to write recipes, rather than having to constantly restart if: branches to avoid exhaustivity-problems.

It would allow writing using a single match: (or whatever you want to call it, like the old sel: before it turned into if:) block for every section of the recipe, which would be much nicer IMO if we're looking at translating things like

examples

this

    - cython                              # [build_platform != target_platform]
    - vtk =*=osmesa_*                     # [build_platform != target_platform and build_variant == "noqt"]
    - vtk =*=qt_*                         # [build_platform != target_platform and build_variant != "noqt"]
    - qt-main                             # [build_platform != target_platform and build_variant == "pyqt"]
    - qt6-main                            # [build_platform != target_platform and build_variant == "pyside6"]

or this

    - crossenv                            # [build_platform != target_platform]
    - maturin >=1.3.2,<2                  # [build_platform != target_platform]
    - {{ stdlib("c") }}
    - {{ compiler('c') }}
    - {{ compiler('cxx') }}               # [win]
    - {{ compiler('rust') }}
    - posix                               # [build_platform == "win-64"]
    - cmake
    - make                                # [unix]
    - cargo-feature                       # [build_platform != target_platform and target_platform == "win-64"]

etc.

The more dimensions (unix vs win, native vs. cross, python version, CUDA or not, implementation variants for openmpi/BLAS/QT/etc.) we need to cover in a given recipe, the harder it gets to do that with simple if: else: branches. Consider something like

        build:
          - match: build_platform != target_platform
              # cross-compilation
            match: build_platform != target_platform and cuda_version
              # cross-compiled CUDA
            match: match(cuda, "<12")
              # CUDA 11.x
            match: match(cuda, ">=12.0")
              # CUDA 12.x
            match: linux
              # linux
            match: osx
              # osx
            match: unix
              # unix
            match: win
              # win
            match: blas_impl == "mkl"
              # mkl

It would be a much lower mental overhead to be able to freely shuffle around lines based on which sections they apply to (e.g. not having to restructure the following into correctly (non-)overlapping if: statements), only adding lines to the selector where they are supposed to be applied.

Nov 02 '24 04:11 h-vetinari

It's completely impossible because duplicate keys are forbidden :(

Nov 02 '24 06:11 wolfv

It's completely impossible because duplicate keys are forbidden :(

Well, how about making each - match: a separate list entry? That's also how multiple if:'s work IIUC.

In other words, the following (note the extra - everywhere)

        build:
          - match: build_platform != target_platform
              # cross-compilation
          - match: build_platform != target_platform and cuda_version
              # cross-compiled CUDA
          - match: match(cuda, "<12")
              # CUDA 11.x
          - match: match(cuda, ">=12.0")
              # CUDA 12.x
          - match: linux
              # linux
          - match: osx
              # osx
          - match: unix
              # unix
          - match: win
              # win
          - match: blas_impl == "mkl"
              # mkl

... should be desugared to the same list as the impossible variant that has multiple match:. The only thing the desugaring would need to learn is to ignore empty lists (for a match: that doesn't apply), though I guess that's already covered by the existing infra (e.g. for an if: without else:)

Nov 02 '24 07:11 h-vetinari

Why not use a top level match with nested cases?

Nov 02 '24 07:11 baszalmstra

Why not use a top level match with nested cases?

I'd be happy with that too!

Nov 02 '24 07:11 h-vetinari

But how is it different from multiple if statements?

Nov 02 '24 07:11 wolfv

You dont need to repeat the negation of the if clauses.

You can more easily write something specific for win, macos, unix without having to repeat not win, etc.

Sorry, im on a phone so its a bit hard to type a proper example.

Nov 02 '24 07:11 baszalmstra

For the record, it's actually possible to use arbitrary Jinja in multi-line YAML strings. I am not sure if we want to keep it, but the following works (after a small bug fix from #1156):

build:
  script: |
    echo "Is ${{ target_platform == "linux-64" }}"
    {% if target_platform == "linux-64" %}
      echo "I AM A LINUX"
    {% elif osx %}
      echo "I AM A MAC"
    {% endif %}

And of course, you could just as well use bash.

So I am not sure if it's worth it to make our syntax more complicated or not ...

Right now it's also possible to do arbitrary things with set and for loops btw. - it just has to be encapsulated in a multi-line string :)

Nov 02 '24 12:11 wolfv

One of the design goals was to make the recipe easier to parse for machines. I think what you are suggesting is taking a step back. Having a match-kinda statement seems like a great (and logical) building block.

Nov 02 '24 13:11 baszalmstra

Yeah but we've never semantically understood the contents of the scripts themselves so I'm not sure if we need that. This doesn't affect the "everything is pure YAML" property.

Nov 02 '24 13:11 wolfv

Sure but just as you mentioned in another issue you just closed, we can validate the control flow statements using the schema. That would become harder and harder the more jinja we start using.

Nov 02 '24 14:11 baszalmstra

If we build our own LSP though .. :) I think I do get your point, but I don't see it super different from e.g. people having issues with their bash scripts inside the script tag.

Nov 02 '24 14:11 wolfv

For the record, it's actually possible to use arbitrary Jinja in multi-line YAML strings. I am not sure if we want to keep it, but the following works (after a small bug fix from #1156):

build: script: | echo "Is ${{ target_platform == "linux-64" }}" {% if target_platform == "linux-64" %} echo "I AM A LINUX" {% elif osx %} echo "I AM A MAC" {% endif %} And of course, you could just as well use bash.

So I am not sure if it's worth it to make our syntax more complicated or not ...

Right now it's also possible to do arbitrary things with set and for loops btw. - it just has to be encapsulated in a multi-line string :)

This does not seem to work:

          {% for bin in qemu_bins %}
            test -f ${QEMU_LD_PREFIX}/{{ bin }}
          {% endfor %}

 │ │ │ + test -f $PREFIX/aarch64-conda-linux-gnu/sysroot/lib64/ld-2.17.so
 │ │ │ + test -f '$PREFIX/aarch64-conda-linux-gnu/sysroot/lib64/{{' bin '}}'
 │ │ │ $PREFIX/etc/conda/test-files/qemu-user-execve-aarch64/0/conda_build.sh: line 14: test: too many arguments

The issue with passing to a bash -c ... is that ${{ xxx }} expands as ["...", "...",] which trips-up bash as it is expecting arrays to be more like (xxx, xxx, ), so this also fails: bash -c 'l=${{ qemu_bins | split(" ") | list }}; for bin in "\${l[@]}"; do test -f ${QEMU_LD_PREFIX}/${bin}; done'

Nov 24 '24 03:11 MementoRC

Where did you set qemu_bins? The context has a behvior where it casts everything to a string.

But if you do:

          {% set qemu_bins = ["foo", "bar"] %}
          {% for bin in qemu_bins %}
            test -f ${QEMU_LD_PREFIX}/{{ bin }}
          {% endfor %}

It has a chance of working :)

Nov 24 '24 09:11 wolfv

Why not use a top level match with nested cases?

Actually, one reason why I do have a preference for bare match:, is that match: - case ... -case suggests disjoint conditions again (at least based on how pattern matching works in most languages), whereas I'd like to not match one specific thing to various disjoint cases, but rather freely match conditions for different quantities that may overlap. Using a different syntax there would avoid triggering some unhelpful muscle memory.

Nov 24 '24 10:11 h-vetinari

@h-vetinari sorry, not following. What's the difference of match then to a number of if clauses if the match' clauses are disjoint? I think what we had in mind so far would be a match that breaks after it finds the first matching entry, e.g.

match:
  - case: unix
    then: 
      - bla
  - case: win and arm64
    then: 
      - foobar
  - case: win
    then:
      - foobar-win64 # NOT happening on arm64

Nov 24 '24 12:11 wolfv

Also your initial example is not at all valid YAML if I see that correctly.

Nov 24 '24 12:11 wolfv

Attached two screenshots highlighting the issues (from https://yamlchecker.com/)

Nov 24 '24 12:11 wolfv

I think what we had in mind so far would be a match that breaks after it finds the first matching entry, e.g.

I explained why that's not desirable in the OP (or at least that there are cases where branches shouldn't be exclusive).

Also your initial example is not at all valid YAML if I see that correctly.

Yes, I hate YAML. 🙃

Nov 24 '24 12:11 h-vetinari

The way I understand it is that you want "if"-syntax without then, correct?

You want falsey branches to be skipped, but every truthy branch to be taken, right?

In e.g. Rust, when usign match, only the first truthy branch is taken. In C/C++ you would have to add a break for this behavior. In Python also only the first truthy value is matched.

But if I understand your proposal correctly, you want every truthy branch to "match"?

Can you add corrected YAML to your proposal? Because I think if you write out proper YAML, it will look just like the current if statements.

Nov 24 '24 13:11 wolfv

It has a chance of working :)

Right! Silly me, I forgot I needed to split (I did know that, just got caught in the Jinja mixin). The variable is in the context section, it is a ${{ [x, y, z] | join() }}

For this particular case, it seems that the rattler/jinja is more complex. The array needs to be joined, then split in script where the {%%} is accepted. Am I missing something?

Also, it seems that the comments are not completely ignored, I had mistakenly put a "xx' in a comment and it seemed to generate a syntax error, that went away after correction (not 100% sure, though)

Update: Not to be difficult, but the initial point was to have the list be defined once for the several packages that use it. I don't think it will work: {% for bin in {{ qemu_bins | split(' ') | list }} %} fails to be parsed and a global {% fails with:

Error:   × Parsing: failed to parse YAML: unexpected character: `%'
   ╭─[1:2]
 1 │ {% qemu_bins_j = [
   ·  ────────┬────────
   ·          ╰── here
 2 │       "bios.bin", "bios-256k.bin", "bios-microvm.bin", "qboot.rom", "vgabios.bin", "vgabios-cirrus.bin"
   ╰────

So I will use python -c ... ${{ qemu_bins }}.split() ..., which is neat enough I concede

Nov 24 '24 15:11 MementoRC

FWIW I ended up with this for a similar problem (multiple if else):

  build_num: >
    ${{ (build_pre | int + 2000) if mpi == "openmpi"
    else ((build_pre | int + 1000) if mpi == "mpich"
    else ((build_pre | int) if mpi == "nompi" else 0)) }}

May 20 '25 23:05 HaoZeke

Yes, even syntax such as

build_num: >
   {% if mpi == "openmpi" %}
       ${{ build_pre | int + 2000 }}
   {% else if ...

Would likely work. Not sure if we should endorse it, though :)

May 21 '25 08:05 wolfv

this may be specific enough to sha256 to warrant a new issue but I wonder whether there is a nicer way to write this:

source:
  - if: numpy_min == "2.3"
    then: 
    - url: ${{ source_url }}
      sha256: eb30717dc7390a038c2247be1a7e8bf1a48b8a4701a01960804830a231631bef
  - if: numpy_min == "2.2"
    then: 
    - url: ${{ source_url }}
      sha256: f6b9a08193c6291c3d6c4717d1ee1b0a8bf5050765165175affde7afaab1639f
  - if: numpy_min == "2.1"
    then: 
    - url: ${{ source_url }}
      sha256: d61dd84290c6faaa35007346b79741e1e100915885a5a5a63d05b401eddbf583
  - if: numpy_min == "2.0"
    then: 
    - url: ${{ source_url }}
      sha256: ba6a23a9d07bb782bda4f5852b476de4e16bad8f979c96e3df1d04e6bb6304c7
  - if: numpy_min == "1.25"
    then: 
    - url: ${{ source_url }}
      sha256: 81f00561a6f0b92436b4c0b64dc9da25a8a31154a4fb0813e3337b2a0afd7db3
  - if: numpy_min == "1.23"
    then: 
    - url: ${{ source_url }}
      sha256: 2deb6a9b0801e23c0597f3654d667238df581a85115d9f813d12ff0afcc842d3
  - if: numpy_min == "1.22"
    then: 
    - url: ${{ source_url }}
      sha256: 0741447dbbaeb3db069341b53903f16d076fcdef931c0190fb15e5bc004463c9

templating directly into the sha256 field seems unsupported

Aug 12 '25 22:08 lucascolley

It's possible to template the SHA256 but our current JSON schema will complain because it expects a hex string of 32 characters there. We can totally change the schema to also accept Jinja expressions (string starts / ends with ${{ ... }}).

I think the most ideal scenario for this case would be if you had a dictionary, so that you could do:

- url: ${{ source_url }}
  sha256: ${{ hashes[numpy_min] }}

We migth be able to do that with some trickery :)

Aug 13 '25 08:08 wolfv

ah great, I mistakenly thought that was more than just the JSON schema complaining. In which case variants.yaml can handle it all nicely:

zip_keys:
- [numpy_min, numpy_cap, python_min, source_hash]

numpy_min:  [ 2.3,  2.2,  2.1,  2.0, 1.25, 1.23, 1.22]
numpy_cap:  [ 2.4,  2.3,  2.2,  2.1,  2.0, 1.25, 1.23]
python_min: [3.11, 3.10, 3.10, 3.10, 3.10, 3.10, 3.10]
source_hash:
- eb30717dc7390a038c2247be1a7e8bf1a48b8a4701a01960804830a231631bef
- f6b9a08193c6291c3d6c4717d1ee1b0a8bf5050765165175affde7afaab1639f
- d61dd84290c6faaa35007346b79741e1e100915885a5a5a63d05b401eddbf583
- ba6a23a9d07bb782bda4f5852b476de4e16bad8f979c96e3df1d04e6bb6304c7
- 81f00561a6f0b92436b4c0b64dc9da25a8a31154a4fb0813e3337b2a0afd7db3
- 2deb6a9b0801e23c0597f3654d667238df581a85115d9f813d12ff0afcc842d3
- 0741447dbbaeb3db069341b53903f16d076fcdef931c0190fb15e5bc004463c9

Aug 13 '25 08:08 lucascolley

More pedestrian than all the syntax stuff discussed so far, but I was surprised that there's no elif: <cond> to go along with

if: <cond>
then: <stuff>
else: <stuff>

(not mentioned in the docs, and crashes an actual recipe where I tried this)

Nov 07 '25 04:11 h-vetinari

@h-vetinari we already discussed that it's invalid YAML right?

Nov 07 '25 09:11 wolfv

A single elif: surely isn't illegal, though I could see how multiple would be problematic.

I don't think it was discussed so far though, at least not with me.

Nov 07 '25 10:11 h-vetinari