
Update NERSC Cori to NERSC Perlmutter in docs

Open · Andrew-S-Rosen opened this issue · 3 comments

There is a sample configuration for Cori in the docs here. Cori is now retired. It'd probably be worth someone contributing an example for Perlmutter.

Andrew-S-Rosen · Jun 30 '23

Sure! Could you do that @arosen93?

guillaumeeb · Jul 02 '23

I'm still sorting it out myself, but once I get it locked in, I will absolutely do so!

Andrew-S-Rosen · Jul 02 '23

@guillaumeeb: I've confirmed the following specs work for Perlmutter. I'm not too familiar with the YAML configuration file format, but it should hopefully be fairly straightforward to translate these class kwargs into the corresponding configuration parameters!

from dask_jobqueue import SLURMCluster

n_workers = 1 # Number of Slurm jobs to launch in parallel
n_nodes_per_calc = 1 # Number of nodes to reserve for each Slurm job
n_cores_per_node = 48 # Number of CPU cores per node
mem_per_node = "512 GB" # Total memory per node
cluster_kwargs = {
    # Dask worker options
    "cores": n_cores_per_node, # total number of cores (per Slurm job) for Dask worker
    "memory": mem_per_node, # total memory (per Slurm job) for Dask worker
    # SLURM options
    "shebang": "#!/bin/bash",
    "account": "myaccount",
    "walltime": "00:10:00", # DD:HH:SS
    "job_mem": "0", # all memory on node
    "job_script_prologue": ["source ~/.bashrc"], # commands to run before calculation, including exports
    "job_directives_skip": ["-n", "--cpus-per-task"], # Slurm directives we can skip
    "job_extra_directives": [f"-N {n_nodes_per_calc}", "-q debug", "-C cpu"], # num. of nodes for calc (-N), queue (-q), and constraints (-c)
}
cluster = SLURMCluster(**cluster_kwargs)
cluster.scale(n_workers)
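
For anyone who wants this in the YAML configuration format instead, here is an untested sketch of what I believe the equivalent `jobqueue.yaml` entry would look like, based on the key names in dask-jobqueue's default configuration (the class kwargs use underscores, while the YAML keys use dashes):

jobqueue:
  slurm:
    # Dask worker options
    cores: 48                 # total number of cores (per Slurm job) for the Dask worker
    memory: 512GB             # total memory (per Slurm job) for the Dask worker
    # SLURM options
    shebang: "#!/bin/bash"
    account: myaccount
    walltime: "00:10:00"      # HH:MM:SS
    job-mem: "0"              # all memory on node
    job-script-prologue:
      - source ~/.bashrc      # commands to run before calculation, including exports
    job-directives-skip:
      - "-n"
      - "--cpus-per-task"
    job-extra-directives:
      - "-N 1"                # number of nodes per Slurm job
      - "-q debug"            # queue
      - "-C cpu"              # constraint

With that in place (e.g., in `~/.config/dask/jobqueue.yaml`), calling `SLURMCluster()` with no kwargs should pick up the same settings, so the example above reduces to `cluster = SLURMCluster()` followed by `cluster.scale(1)`. I haven't tested the YAML route myself, though, so treat it as a starting point.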

Andrew-S-Rosen · Jul 29 '23