Supporting SLURM env vars for launching MAPDL configuration

Open germa89 opened this issue 1 year ago • 8 comments

As the title.

The idea is that, on SLURM HPC clusters, PyMAPDL reads the job configuration from the environment variables that the SLURM manager creates, so it can launch MAPDL with the appropriate number of cores.

You can see in the code:

options = [
    # 4,  # Fallback option
    SLURM_CPUS_PER_TASK * SLURM_NTASKS,  # (CPUs)
    SLURM_NPROCS,  # (CPUs)
    # SLURM_NTASKS,  # (tasks) Not necessarily the number of CPUs
    # SLURM_NNODES * SLURM_TASKS_PER_NODE * SLURM_CPUS_PER_TASK,  # (CPUs)
    SLURM_CPUS_ON_NODE * SLURM_NNODES,  # (CPUs)
]
nproc = max(options)

This is how I decide how many cores the MAPDL instance should use.
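As a rough sketch of how the values in the snippet above could be read from the environment: the helper names `slurm_int` and `guess_nproc` are hypothetical and not part of PyMAPDL, and treating a missing or malformed variable as 1 is an assumption made here only to keep the example self-contained.

```python
import os


def slurm_int(name, environ=None, default=1):
    """Read a SLURM variable as a positive int, or return `default`.

    Hypothetical helper: falling back to 1 is an assumption of this sketch.
    """
    environ = os.environ if environ is None else environ
    try:
        value = int(environ.get(name, ""))
    except ValueError:
        # Variable is missing or not a plain integer.
        return default
    return value if value > 0 else default


def guess_nproc(environ=None):
    """Return the largest CPU count implied by the SLURM variables."""
    options = [
        slurm_int("SLURM_CPUS_PER_TASK", environ) * slurm_int("SLURM_NTASKS", environ),
        slurm_int("SLURM_NPROCS", environ),
        slurm_int("SLURM_CPUS_ON_NODE", environ) * slurm_int("SLURM_NNODES", environ),
    ]
    return max(options)
```

Passing a plain dictionary instead of `os.environ` makes the logic easy to check outside a cluster, for example `guess_nproc({"SLURM_CPUS_PER_TASK": "4", "SLURM_NTASKS": "2"})`.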

Same with memory:


if SLURM_MEM_PER_NODE:
    # RAM argument is in MB, so we need to convert
    if SLURM_MEM_PER_NODE[-1] == "T":  # tera
        ram = int(SLURM_MEM_PER_NODE[:-1]) * 10**6
    elif SLURM_MEM_PER_NODE[-1] == "G":  # giga
        ram = int(SLURM_MEM_PER_NODE[:-1]) * 10**3
    elif SLURM_MEM_PER_NODE[-1] == "M":  # mega
        ram = int(SLURM_MEM_PER_NODE[:-1]) * 10**0
    elif SLURM_MEM_PER_NODE[-1].upper() == "K":  # kilo
        ram = int(SLURM_MEM_PER_NODE[:-1]) * 10 ** (-3)
    else:
        ram = int(SLURM_MEM_PER_NODE)
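The same conversion could be sketched as a small table-driven function. The name `slurm_mem_to_mb` is hypothetical; the sketch keeps the decimal factors from the snippet above, and it assumes that a value without a suffix is already in MB and that kilobyte values should be rounded down to whole MB.

```python
# MB per unit, using the same decimal factors as the snippet above.
_MB_PER_UNIT = {"M": 1, "G": 10**3, "T": 10**6}


def slurm_mem_to_mb(value):
    """Convert a SLURM memory string such as "4G" or "2048" to whole MB."""
    text = value.strip()
    number, suffix = text[:-1], text[-1:].upper()
    if suffix == "K":
        # Assumption: round kilobytes down to whole megabytes.
        return int(number) // 1000
    if suffix in _MB_PER_UNIT:
        return int(number) * _MB_PER_UNIT[suffix]
    # No recognised suffix: assume the value is already in MB.
    return int(text)
```

Upper-casing the suffix once means lower-case units such as `4g` take the same path, and integer arithmetic avoids returning a float for kilobyte values.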

I do not use the MAPDL -machines argument to specify the machines used. I don't think it is needed/compatible.

germa89 avatar Feb 08 '24 14:02 germa89

Thanks for opening a Pull Request. If you want to perform a review write a comment saying:

@ansys-reviewer-bot review

ansys-reviewer-bot[bot] avatar Feb 08 '24 14:02 ansys-reviewer-bot[bot]

Considering this https://github.com/koesterlab/setup-slurm-action for testing this PR.

However, sharing the MAPDL installation with the cluster seems a challenge.


germa89 avatar Feb 28 '24 11:02 germa89

> Considering this https://github.com/koesterlab/setup-slurm-action for testing this PR.
>
> However, sharing the MAPDL installation with the cluster seems a challenge.

or use the pcluster pipelines to do it?


sa-cross avatar Mar 01 '24 13:03 sa-cross

> > Considering this https://github.com/koesterlab/setup-slurm-action for testing this PR. However, sharing the MAPDL installation with the cluster seems a challenge.
>
> or use the pcluster pipelines to do it?

I'm not sure I want to have different parts of the testing in different repositories. Can I test from the pymapdl repository using the pcluster pipelines?


germa89 avatar Mar 06 '24 16:03 germa89

cc @greschd. This might be related to your interest in launcher configuration.

koubaa avatar Mar 08 '24 13:03 koubaa

Wiz Scan Summary

IaC Misconfigurations 0C 0H 0M 0L 0I
Vulnerabilities 0C 0H 0M 0L 0I
Sensitive Data 0C 0H 0M 0L 0I
Total 0C 0H 0M 0L 0I
Secrets 0🔑

wiz-inc-572fc38784[bot] avatar Mar 14 '24 09:03 wiz-inc-572fc38784[bot]

Codecov Report

Attention: Patch coverage is 75.20000% with 31 lines in your changes missing coverage. Please review.

Project coverage is 86.88%. Comparing base (93dd176) to head (36d33e6). Report is 9 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #2754      +/-   ##
==========================================
- Coverage   87.13%   86.88%   -0.25%     
==========================================
  Files          55       55              
  Lines        9816    11070    +1254     
==========================================
+ Hits         8553     9618    +1065     
- Misses       1263     1452     +189     

codecov-commenter avatar Apr 02 '24 10:04 codecov-commenter

This PR should be complemented with the https://github.com/ansys-internal/rep-orchestration-interfaces library.


germa89 avatar Jul 03 '24 14:07 germa89

@pyansys-ci-bot LGTM.

germa89 avatar Aug 26 '24 11:08 germa89