Creating an error when a SLURM variable isn't found, usually because …
Raises an EnvironmentError when the variable SLURM_LAUNCH_NODE_IPADDR is not found. This usually happens because the user isn't launching the job with srun and is instead using something like sbatch. Instead of raising a TypeError, this will now raise an EnvironmentError and print some help.
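A minimal sketch of the behaviour described above, not the actual PhysicsNeMo code. The helper name `get_slurm_launch_addr` and its `env` parameter are hypothetical and exist only for illustration. Since a mapping's `.get` returns `None` for a missing key rather than raising, the sketch checks for `None` explicitly before raising the error.

```python
import os


def get_slurm_launch_addr(env=None):
    """Return SLURM_LAUNCH_NODE_IPADDR or raise a descriptive EnvironmentError.

    Hypothetical helper for illustration; not part of physicsnemo.
    """
    # Default to the real process environment; a dict can be passed for testing.
    env = os.environ if env is None else env
    # .get returns None when the key is missing, so check for that explicitly.
    addr = env.get("SLURM_LAUNCH_NODE_IPADDR")
    if addr is None:
        raise EnvironmentError(
            'SLURM variable "SLURM_LAUNCH_NODE_IPADDR" was not detected in the '
            'environment. Maybe you need to run with "srun"?'
        )
    return addr
```

Passing a plain dict as `env` makes it possible to exercise both the success and the failure path without touching the real environment.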
PhysicsNeMo Pull Request
Description
- Line 359 of `physicsnemo.distributed.manager` now has a `try: ... except TypeError:` statement.
- Addresses issue #819
Checklist
- [x] I am familiar with the Contributing Guidelines.
- [x] New or existing tests cover these changes.
- [x] The documentation is up to date with these changes.
- [x] The CHANGELOG.md is up to date with these changes.
- [x] An issue is linked to this pull request.
Dependencies
N/A
/blossom-ci
FYI, this is failing CI due to formatting issues. Nothing is really wrong; black is just picky. I can't edit your branch, so can you please apply this diff to the manager.py file?
❯ git diff
diff --git a/physicsnemo/distributed/manager.py b/physicsnemo/distributed/manager.py
index f9cc5be..2eb0c67 100644
--- a/physicsnemo/distributed/manager.py
+++ b/physicsnemo/distributed/manager.py
@@ -359,8 +359,9 @@ class DistributedManager(object):
         try:
             addr = os.environ.get("SLURM_LAUNCH_NODE_IPADDR")
         except TypeError:
-            raise EnvironmentError('SLURM variable "SLURM_LAUNCH_NODE_IPADDR" was not detected in the environment. Maybe you need to run with "srun"?')
-
+            raise EnvironmentError(
+                'SLURM variable "SLURM_LAUNCH_NODE_IPADDR" was not detected in the environment. Maybe you need to run with "srun"?'
+            )
         DistributedManager.setup(
             rank=rank,
(it's just moving the string to a new line...)
I've done that, sorry for not linting!
/blossom-ci
Closing per #819 - reopen if necessary in the future.