snakemake-github-action

Feature request: Option to make failure == success

Open hepcat72 opened this issue 2 years ago • 2 comments

I have a couple of error-check rules to ensure a correct config setup. I give those rules a higher priority so that they run first and the workflow fails fast. It would be great if I could create a test using the snakemake GitHub action that ensures that a run with a bad config does in fact exit non-zero.

Here are 2 examples of such rules:

rule check_peak_read_len_overlap_params:
    # This code throws an error if the fraction of the minimum peak (or
    # summit) width (based on MAX_ARTIFACT_WIDTH and SUMMIT_FLANK_SIZE) over
    # the max read length is less than the fracOverlap (FRAC_READ_OVERLAP) that
    # is supplied to featureCounts.
    input:
        "results/QC/max_read_length.txt",
    output:
        "results/QC/parameter_validation.txt",
    params:
        frac_read_overlap=FRAC_READ_OVERLAP,
        max_artifact_width=MAX_ARTIFACT_WIDTH,
        summit_flank_size=SUMMIT_FLANK_SIZE,
    log:
        "results/QC/logs/parameter_validation.log",
    conda:
        "../envs/python3.yml"
    script:
        "../scripts/check_peak_read_len_overlap_params.py"


rule check_biological_replicates:
    # There is no input or output for this rule.  It depends on the params.  The log.status file is included in the all rule when metadata is present (see common.smk).
    params:
        conditions="\n\t".join(
            [
                f"{nonrep['dataset']}:{nonrep['experimental_condition']}"
                for nonrep in SAMPLES_WITHOUT_BIOLOGICAL_REPLICATES
            ]
        ),
        samples="\n\t".join(
            [
                f"{nonrep['biological_sample']}:{','.join(nonrep['sample_ids'])}"
                for nonrep in SAMPLES_WITHOUT_BIOLOGICAL_REPLICATES
            ]
        ),
    priority: 1
    log:
        err="results/logs/sample_status.err",
        status="results/logs/sample_status.txt",
    conda:
        "../envs/bedtools_coreutils_gawk_gzip.yml"
    shell:
        """
        if [ "{params.conditions}" == "" ]; then \
            echo "STATUS=GOOD. No non-replicates detected with metadata." > {log.status:q}; \
            touch {log.err:q}; \
        else \
            echo "STATUS=BAD. Non-replicates detected with metadata." > {log.status:q}; \
            MSG="NONREPLICATES DETECTED ERROR: Biological replicates are required in each experimental condition. The following 'dataset:experimental_condition's have only a single biological sample:\n\n\t"; \
            MSG="${{MSG}}{params.conditions}\n\n"; \
            MSG="${{MSG}}There are 4 ways to deal with this error:\n\n"; \
            MSG="${{MSG}}1. Add samples to each of the conditions listed above (or add the experimental conditions to existing samples that do not currently have an annotated experimental condition).\n"; \
            MSG="${{MSG}}2. Remove all metadata from the sample sheet (except sample_id and dataset) for the following biological samples:IDs:\n\n\t"; \
            MSG="${{MSG}}{params.samples}\n\n"; \
            MSG="${{MSG}}3. Remove all rows from the sample sheet for the biological samples:IDs shown above under option 2.\n"; \
            MSG="${{MSG}}4. Skip the replicate error check by adding the following option to your snakemake command:\n\n\t--omit-from check_biological_replicates\n\n"; \
            printf "$MSG" > {log.err:q}; \
            exit 1; \
        fi
        """

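Stripped of the Snakemake templating, the pass/fail logic of check_biological_replicates can be sketched as a standalone bash function. This is an illustration only: the function name check_replicates and its three arguments are hypothetical, and the error message is abbreviated. The point is that a non-empty list of conditions leads to a non-zero return, which is what makes the workflow stop early.

```shell
#!/usr/bin/env bash

# Hypothetical standalone version of the rule's shell logic (not part of the
# workflow). Usage: check_replicates CONDITIONS STATUS_FILE ERR_FILE
check_replicates() {
  local conditions="$1" status_file="$2" err_file="$3"

  if [ -z "$conditions" ]; then
    # Nothing to report: record a good status and leave an empty error log.
    echo "STATUS=GOOD. No non-replicates detected with metadata." > "$status_file"
    touch "$err_file"
    return 0
  fi

  # At least one condition lacks replicates: record it and signal failure.
  echo "STATUS=BAD. Non-replicates detected with metadata." > "$status_file"
  printf 'NONREPLICATES DETECTED ERROR:\n\t%s\n' "$conditions" > "$err_file"
  return 1
}
```

Inside the rule, the equivalent of the non-zero return is the exit 1, which causes the job to fail.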
I have worked around this issue by using the base environment's install of snakemake, but it would be great if I didn't have to do that:

---
name: Test Snakemake Fails

"on": push

jobs:
  run-lint:
    runs-on: ubuntu-latest
    defaults:
      run:
        # This enables running of conda(-installed) commands (e.g. `snakemake`)
        # in the rules below.  See:
        # https://github.com/conda-incubator/setup-miniconda/issues/128
        shell: bash -l {0}

    steps:

      - name: Checkout code
        uses: actions/checkout@v3
        with:
          fetch-depth: 0

      - name: Install dependencies
        uses: conda-incubator/setup-miniconda@v2
        with:
          python-version: "3.9"
          auto-update-conda: true
          use-mamba: true
          mamba-version: "*"
          miniforge-variant: Mambaforge
          auto-activate-base: false
          environment-file: environment.yml
          channel-priority: true
          activate-environment: ATACCompendium
      - name: Display all conda & env info
        run: |
          conda info -a
          conda list
          conda config --show-sources

      - name: Ensure error when any experimental condition contains no biological replicates
        run: |
          bash scripts/test_snakemake_fails.sh "$CONDA" --use-conda --cores 2 --directory .tests/test_6_missingreplicates

hepcat72, Aug 04 '23 15:08

Dear Robert, I'm not sure I fully understand your setup. Won't the tests fail as desired with your exit 1 statement? Why would you need a change to the GitHub action if this problem can be solved at the workflow level?

m-jahn, Nov 04 '25 15:11

I can see why you're having trouble following the setup. I clearly didn't provide enough information here to describe what I was requesting. It's been so long that I had trouble figuring out exactly what feature I was requesting myself (I haven't worked with snakemake since, so I'm rusty, and it took a while to reverse-engineer my question...). But I looked in my repo at the shell script I was calling (test_snakemake_fails.sh) and it has this:

#!/usr/bin/env bash

# USAGE: ./scripts/test_snakemake_fails.sh /path/to/conda/install/dir <all snakemake options>

# This is intended to be used by .github/workflows/test-snakemake-fails.yml
# $CONDA (passed in as the first argument) is an environment variable setup by github actions

set +e

if [ "$1" != "" ]; then
  CONDABIN="$1/bin"
  PATH="$CONDABIN:$PATH"
fi

shift

if snakemake -q all "$@";
then
  1>&2 echo "Test failed.  Snakemake did not exit with a non-zero exit status as expected: "$?
  exit 1
else
  echo "Test succeeded.  Snakemake exited with expected non-zero status: "$?
  exit 0
fi

Based on that script and the title of this feature request ("Feature request: Option to make failure == success"), what I am asking for is a way for an expected failure of a snakemake run to count as a successful test. I.e., I want to ensure that the workflow fails under certain conditions. The problem is that I cannot do that without wrapping such a test in a shell script and using the snakemake command installed in the environment. Ideally, I could create a test that doesn't explicitly call the snakemake command, like this:

# This runs snakemake tests.
# Requirements and design can be found in the associated issue:
# https://github.com/PrincetonUniversity/ATACCompendium/issues/38
---
name: Functional Tests

"on": [push, workflow_dispatch]

jobs:
  complete-workflow:
    runs-on: ubuntu-latest
    steps:
      - name: Checkout Codebase
        uses: actions/checkout@v3
      - name: Test Entire Snakemake Workflow
        uses: snakemake/snakemake-github-action@v1
        with:
          directory: '.tests/test_1'
          snakefile: 'workflow/Snakefile'
          args: '--use-conda --cores 2 --show-failed-logs --printshellcmds'

And if I were to envision how such a test for failure might look, it might add an argument named expected_exit_status under the with key, like this (supporting either any non-zero exit status or a specific exit status):

# This runs snakemake tests.
# Requirements and design can be found in the associated issue:
# https://github.com/PrincetonUniversity/ATACCompendium/issues/38
---
name: Functional Tests

"on": [push, workflow_dispatch]

jobs:
  complete-workflow:
    runs-on: ubuntu-latest
    steps:
      - name: Checkout Codebase
        uses: actions/checkout@v3
      - name: TEST FOR SPECIFIC FAILURE
        uses: snakemake/snakemake-github-action@v1
        with:
          directory: '.tests/test_1'
          snakefile: 'workflow/Snakefile'
          args: '--use-conda --cores 2 --show-failed-logs --printshellcmds'
          expected_exit_status: 2

      - name: TEST FOR ANY FAILURE
        uses: snakemake/snakemake-github-action@v1
        with:
          directory: '.tests/test_1'
          snakefile: 'workflow/Snakefile'
          args: '--use-conda --cores 2 --show-failed-logs --printshellcmds'
          expected_exit_status: -1

where an expected_exit_status of -1 means "non-zero".
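To make the proposed semantics concrete, here is a minimal bash sketch of how such a comparison could behave. It is not part of snakemake-github-action; the function names status_matches and run_expecting are hypothetical, and the only convention assumed is the one described above (-1 means "any non-zero status", any other value must match exactly).

```shell
#!/usr/bin/env bash

# Hypothetical helper: status_matches EXPECTED ACTUAL
# EXPECTED of -1 accepts any non-zero ACTUAL; otherwise the two must be equal.
status_matches() {
  local expected="$1" actual="$2"
  if [ "$expected" -eq -1 ]; then
    [ "$actual" -ne 0 ]
  else
    [ "$actual" -eq "$expected" ]
  fi
}

# Hypothetical wrapper: run_expecting EXPECTED COMMAND [ARGS...]
# Runs the command and returns 0 only if its exit status matches EXPECTED.
run_expecting() {
  local expected="$1"
  shift
  local actual=0
  # Capture the status without aborting if the caller uses `set -e`.
  "$@" || actual=$?
  if status_matches "$expected" "$actual"; then
    echo "Test succeeded. Exit status $actual matched expectation $expected."
    return 0
  fi
  echo "Test failed. Exit status $actual did not match expectation $expected." >&2
  return 1
}
```

With the existing work-around, this could be called as run_expecting -1 snakemake -q all --use-conda --cores 2 --directory .tests/test_6_missingreplicates, which passes only when snakemake exits non-zero.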

hepcat72, Nov 04 '25 17:11

Dear Robert, thanks for the clarification! I see now what you mean: you want a particular test case to fail. However, I think this is a very specific case, and I'm not sure it's worth the overhead of implementing the exit status option. If you like, you can make a PR though. If not, maybe we can close the issue.

m-jahn, Nov 05 '25 08:11

I'd say this is just an example and that being able to test error cases is fairly generally applicable, but it's up to you. The work-around works.

hepcat72, Nov 05 '25 09:11