GARD process fails if temp dir path for MPI is too long

Open fischer-hub opened this issue 3 years ago • 3 comments

Just ran into this issue while running with singularity on an HPC:

PMIx has detected a temporary directory name that results
in a path that is too long for the Unix domain socket:

    Temp dir: /data/scratch2/fisched99/poseidon_wd/77/dbac715a7aa899f39937e7064d2a33/openmpi-sessions-326421@cmp216_0/46869

Try setting your TMPDIR environmental variable to point to
something shorter in length

Currently TMPDIR is set to ${PWD}; maybe we can change it to point to /tmp or something else to shorten the path. This probably doesn't happen too often though, especially in local runs.
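For anyone wanting to check whether their own work directory would hit this, here is a minimal sketch of the length check. It assumes Linux, where `sun_path` in `struct sockaddr_un` holds 108 bytes including the terminating NUL (macOS and the BSDs allow 104). The function name `fits_unix_socket` and the default socket file name `pmix-12345` are made up for illustration; the real name PMIx appends will differ.

```python
import os

# Assumption: Linux, where sockaddr_un.sun_path is 108 bytes including the
# terminating NUL. macOS and the BSDs use 104 instead.
SUN_PATH_MAX = 108


def fits_unix_socket(directory: str, socket_name: str = "pmix-12345") -> bool:
    """Return True if directory/socket_name fits into sun_path.

    socket_name is a hypothetical placeholder, not the name PMIx really uses.
    """
    path = os.path.join(directory, socket_name)
    # Compare the encoded byte length and leave one byte for the NUL terminator.
    return len(os.fsencode(path)) < SUN_PATH_MAX


# Session directory from the error message above (TMPDIR pointing at ${PWD}).
long_dir = (
    "/data/scratch2/fisched99/poseidon_wd/77/dbac715a7aa899f39937e7064d2a33"
    "/openmpi-sessions-326421@cmp216_0/46869"
)
# The same session directory if TMPDIR pointed at /tmp instead.
short_dir = "/tmp/openmpi-sessions-326421@cmp216_0/46869"

print(fits_unix_socket(long_dir))
print(fits_unix_socket(short_dir))
```

The check only looks at the path length, so it does not tell you whether /tmp is actually writable or shared between nodes on a given cluster.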

fischer-hub avatar Jan 11 '22 13:01 fischer-hub

I am not sure if there was a reason I set this to $PWD... It could be that an absolute path is needed, but you are right that something like /tmp would also work.
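A rough sketch of how both constraints could be met at once: create a short, absolute temporary directory under /tmp and pass it as TMPDIR only to the child process. This is an illustration and not what the pipeline currently does; the helper name `run_with_short_tmpdir` and the `gard_` prefix are assumptions, and /tmp is assumed to exist and be writable.

```python
import os
import subprocess
import sys
import tempfile


def run_with_short_tmpdir(cmd, base="/tmp"):
    """Run cmd with TMPDIR pointing at a short, absolute directory under base.

    Hypothetical helper for illustration; base is assumed to be writable.
    """
    # TemporaryDirectory removes the directory again once the command is done.
    with tempfile.TemporaryDirectory(prefix="gard_", dir=base) as short_tmp:
        # Copy the current environment and override TMPDIR for the child only.
        env = dict(os.environ, TMPDIR=short_tmp)
        return subprocess.run(
            cmd, env=env, check=True, capture_output=True, text=True
        )


# Usage: the child process sees the short TMPDIR instead of ${PWD}.
result = run_with_short_tmpdir(
    [sys.executable, "-c", "import os; print(os.environ['TMPDIR'])"]
)
print(result.stdout.strip())
```

Because the override is limited to the child's environment, the rest of the run keeps whatever TMPDIR it had before.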

hoelzer avatar Jan 11 '22 20:01 hoelzer

Yeah, same issue - it's in unix and not poseidon, so I think you have to run in /tmp

mchaisso avatar Jan 15 '22 18:01 mchaisso

> Yeah, same issue - it's in unix and not poseidon, so I think you have to run in /tmp

You can try to run on branch gard_hpc_issues, this should fix the GARD_detect issues on HPC.

fischer-hub avatar Jan 17 '22 11:01 fischer-hub