
[build] Snakemake will auto-detect an incorrect number of CPUs on AWS Batch when `--cpus` is not used

[Open] tsibley opened this issue 3 years ago • 3 comments

Current behavior

In the absence of a --cpus option to nextstrain build, the AWS Batch runner uses the job definition's default CPU allocation (vcpus), which is not detectable by Snakemake. Instead, Snakemake detects and uses the number of CPUs on the entire Batch node.

I expect this will cause oversubscription of the CPU time actually available to the Batch job, though not always: Snakemake deals in the number of processes active at one time within the container, while the Batch job's CPU limit deals in cumulative CPU time spent over a period (expressed relative to a single pegged CPU over that same period) across the whole container process tree. So oversubscription will mostly be an issue when all the Snakemake-managed processes peg their CPUs at the same time (e.g. during alignment or tree building).
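
To make the detection gap concrete: the CPU count a process can see reflects the whole node, while the job's limit (when enforced as a CFS quota) lives in the cgroup filesystem, which Snakemake doesn't consult. Below is a minimal sketch of the distinction, assuming Linux and cgroup v1; how Batch actually enforces vcpus varies by compute environment, and cgroup v2 uses /sys/fs/cgroup/cpu.max instead.

import os

def visible_cpus():
    # What Snakemake's --cores all effectively reflects: the CPUs this
    # process may run on, which in a Batch container is the whole node.
    return len(os.sched_getaffinity(0))

def cgroup_quota_cpus():
    # Effective CPUs implied by a cgroup v1 CFS quota, or None when
    # unlimited or unreadable.  (Assumes v1 paths; see lead-in above.)
    try:
        with open("/sys/fs/cgroup/cpu/cpu.cfs_quota_us") as f:
            quota = int(f.read())
        with open("/sys/fs/cgroup/cpu/cpu.cfs_period_us") as f:
            period = int(f.read())
    except OSError:
        return None
    return quota / period if quota > 0 else None

print("visible CPUs:", visible_cpus())
print("quota CPUs:  ", cgroup_quota_cpus())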

Expected behavior

Snakemake is limited to the same number of CPUs that are nominally provisioned by the Batch job definition when --cpus isn't provided.

How to reproduce

Run this Snakefile under various conditions on AWS Batch and observe that workflow.cores reports the number of CPUs of the underlying Batch node if you pass --cores all.

import os
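
# Each echo reports a different view of CPU availability; the Python
# expressions are interpolated by Snakemake at parse time, while nproc
# runs when the job's shell command executes.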

rule:
    shell: f"""
        echo workflow.cores {workflow.cores}
        echo nproc "$(nproc)"
        echo os.sched_getaffinity {len(os.sched_getaffinity(0))}
        echo os.cpu_count {os.cpu_count()}
        env | grep -i threads || true
    """

Possible solution

  1. Explicitly inform Snakemake of the Batch job definition's CPU allocation (vcpus), since that's what will apply in the absence of --cpus (a rough sketch follows this list). Downside: this breaks boundaries between the build command and the Batch runner, since the former is what informs Snakemake but the latter is what knows about Batch job definitions.
  2. Require --cpus for the AWS Batch runner. Downside: this is a breaking change and erodes the runtime portability of nextstrain build's interface.
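
As a rough sketch of option 1, the runner could read vcpus from the job definition with boto3 and pass it along; the helper name and its integration point here are hypothetical, and error handling is elided:

import boto3

def job_definition_vcpus(name):
    # Hypothetical helper: return the default vCPU allocation of the latest
    # active revision of a Batch job definition (what Batch runs by default).
    batch = boto3.client("batch")
    response = batch.describe_job_definitions(jobDefinitionName=name, status="ACTIVE")
    latest = max(response["jobDefinitions"], key=lambda d: d["revision"])
    return latest["containerProperties"]["vcpus"]

# The build command could then append, e.g., ["--cores", str(vcpus)] to the
# Snakemake invocation whenever the user didn't specify --cpus.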

Workaround

Always specify --cpus for nextstrain build when using AWS Batch, as this both reserves CPU time for the Batch job and informs Snakemake of the correct --cores number.
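
For example (CPU count and build directory illustrative):

nextstrain build --aws-batch --cpus 8 my-build/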

Additional context

Realized this was happening when implementing #179.

A solution here is possibly related to https://github.com/nextstrain/cli/issues/144; it will at least touch the same code.

tsibley · May 25 '22 21:05

I expect this will cause oversubscription of the CPU time actually available to the Batch job, though not always: Snakemake deals in the number of processes active at one time within the container, while the Batch job's CPU limit deals in cumulative CPU time spent over a period…

Relevant for thinking about this mismatch: https://danluu.com/cgroup-throttling/

(Dan Luu's analyses/write-ups are generally great, and I'd seen and read this one before, but it came back up again for me over the weekend, and I recalled that it applied to this issue.)
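
To see the throttling behavior Dan Luu describes from inside a job, the cgroup accounting files expose it directly. A minimal sketch, assuming cgroup v1 paths (cgroup v2 consolidates these into cpu.max and cpu.stat):

# Read the CFS quota and throttling counters for this container's cgroup.
# Assumes cgroup v1; under cgroup v2 see /sys/fs/cgroup/cpu.max and cpu.stat.
def read(path):
    with open(path) as f:
        return f.read().strip()

quota = int(read("/sys/fs/cgroup/cpu/cpu.cfs_quota_us"))    # -1 means unlimited
period = int(read("/sys/fs/cgroup/cpu/cpu.cfs_period_us"))  # typically 100000 µs
stat = dict(line.split() for line in read("/sys/fs/cgroup/cpu/cpu.stat").splitlines())

if quota > 0:
    print(f"effective CPUs: {quota / period:.2f}")
print(f"periods throttled: {stat.get('nr_throttled')} of {stat.get('nr_periods')}")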

tsibley · Nov 21 '22 19:11

Require --cpus for the AWS Batch runner. Downside: this is a breaking change and erodes the runtime portability of nextstrain build's interface.

This is my preference.

jameshadfield · Apr 12 '23 23:04

Hmm. What if the other solution:

Explicitly inform Snakemake of the Batch job definition's CPU allocation (vcpus), since that's what will apply in the absence of --cpus. Downside: this breaks boundaries between the build command and the Batch runner, since the former is what informs Snakemake but the latter is what knows about Batch job definitions.

is possible without breaking boundaries (i.e. by redrawing boundaries)?

I'd be inclined to redraw the boundaries so that we can cleanly use the default from Batch for --cpus.
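
To sketch what redrawn boundaries might look like (all names below are hypothetical, not existing nextstrain/cli interfaces): each runner reports the default CPU allocation its runtime will grant, and the build command falls back to that when --cpus isn't given.

# Hypothetical sketch only; none of these names exist in nextstrain/cli today,
# and job_definition_vcpus() is the boto3 helper sketched earlier.

class AWSBatchRunner:
    def __init__(self, job_definition="nextstrain-job"):
        self.job_definition = job_definition

    def default_cpus(self):
        # What Batch will actually grant: the job definition's vcpus
        # (see the boto3 sketch under "Possible solution").
        return job_definition_vcpus(self.job_definition)

class LocalRunner:
    def default_cpus(self):
        return None  # no runtime-imposed limit; let Snakemake decide

def effective_cpus(opts, runner):
    # An explicit --cpus still wins; otherwise use the runtime's default.
    return opts.cpus if opts.cpus is not None else runner.default_cpus()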

tsibley · Apr 14 '23 20:04