The simple hello-world-1.c MPI program (source below) prints the following shmem/mmap error:
$ ./hello-world-1
[xx.xx.xx:12584] shmem: mmap: an error occurred while determining whether or not /tmp/ompi.yv.1001/jf.0/3074883584/sm_segment.yv.1001.b7470000.0 could be created.
---program---
$ cat hello-world-1.c
#include <mpi.h>
#include <stdio.h>
#include <unistd.h> /* getpid(), sleep() */

int main(int argc, char** argv) {
    // Initialize the MPI environment
    MPI_Init(NULL, NULL);

    // Get the number of processes
    int world_size;
    MPI_Comm_size(MPI_COMM_WORLD, &world_size);

    // Get the rank of the process
    int world_rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &world_rank);

    // Get the name of the processor
    char processor_name[MPI_MAX_PROCESSOR_NAME];
    int name_len;
    MPI_Get_processor_name(processor_name, &name_len);

    // Print off a hello world message
    printf("> Hello world from processor %s, rank %d out of %d processors (pid=%d)\n",
           processor_name, world_rank, world_size, (int)getpid());
    sleep(1);
    printf("< Hello world from processor %s, rank %d out of %d processors (pid=%d)\n",
           processor_name, world_rank, world_size, (int)getpid());

    // Finalize the MPI environment.
    MPI_Finalize();
    return 0;
}
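For reference, the program is built with the usual compiler wrapper (that mpicc comes from the FreeBSD package is my assumption):
$ mpicc -o hello-world-1 hello-world-1.c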
Version: openmpi-5.0.5_1
Describe how Open MPI was installed: FreeBSD package
Computer hardware: Intel CPU
Network type: Ethernet/IP (irrelevant)
Available space in /tmp: 64 GB
Operating system: FreeBSD 14.1
Please provide all the information from the debug issue template; thanks!
https://github.com/open-mpi/ompi/blob/main/.github/ISSUE_TEMPLATE/bug_report.md
I added missing bits of information.
The root cause could be either not enough available space in /tmp (unlikely per your description) or something going wrong when checking the size.
Try running
env OMPI_MCA_shmem_base_verbose=100 ./hello-world-1
and check the output (the useful messages might have been compiled out, though).
If there is nothing useful, you can
strace -o hw.strace -s 512 ./hello-world-1
then compress hw.strace and upload it.
env OMPI_MCA_shmem_base_verbose=100 ./hello-world-1
This didn't produce anything relevant.
strace -o hw.strace -s 512 ./hello-world-1
BSDs have ktrace instead. Here is the ktrace dump: https://freebsd.org/~yuri/openmpi-kernel-dump.txt
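For reference, a dump like this can be produced along these lines (the exact flags are my assumption, not taken from the thread):
$ ktrace -f hw.ktrace ./hello-world-1
$ kdump -f hw.ktrace > openmpi-kernel-dump.txt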
51253 hello-world-1 CALL fstatat(AT_FDCWD,0x1b0135402080,0x4c316d20,0)
51253 hello-world-1 NAMI "/tmp/ompi.yv.0/jf.0/2909405184"
51253 hello-world-1 RET fstatat -1 errno 2 No such file or directory
51253 hello-world-1 CALL open(0x1b0135402080,0x120004<O_RDONLY|O_NONBLOCK|O_DIRECTORY|O_CLOEXEC>)
51253 hello-world-1 NAMI "/tmp/ompi.yv.0/jf.0/2909405184"
51253 hello-world-1 RET open -1 errno 2 No such file or directory
It looks like some directories were not created.
What if you run mpirun -np 1 ./hello-world-1 instead?
sudo mpirun -np 1 ./hello-world-1 prints the same error message:
It appears as if there is not enough space for /dev/shm/sm_segment.yv.0.9f060000.0 (the shared-memory backing
file). It is likely that your MPI job will now either abort or experience
performance degradation.
The log doesn't have any mkdir operations, so that "/tmp/ompi.yv.0" was never created.
Well, this is a different message than the one shown when opening this issue, and this one is self-explanatory.
Anyway, what if you run
env OMPI_MCA_shmem_mmap_backing_file_base_dir=/tmp ./hello-world-1
Or you can simply increase the size of /dev/shm.
sudo OMPI_MCA_shmem_mmap_backing_file_base_dir=/tmp ./hello-world-1 produces the same error messages.
This message is for a regular user:
$ OMPI_MCA_shmem_mmap_backing_file_base_dir=/tmp ./hello-world-1
[yv.noip.me:88431] shmem: mmap: an error occurred while determining whether or not /tmp/ompi.yv.1001/jf.0/1653407744/sm_segment.yv.1001.628d0000.0 could be created.
> Hello world from processor yv.noip.me, rank 0 out of 1 processors (pid=88431)
< Hello world from processor yv.noip.me, rank 0 out of 1 processors (pid=88431)
This message is for root:
# OMPI_MCA_shmem_mmap_backing_file_base_dir=/tmp ./hello-world-1
--------------------------------------------------------------------------
It appears as if there is not enough space for /dev/shm/sm_segment.yv.0.ee540000.0 (the shared-memory backing
file). It is likely that your MPI job will now either abort or experience
performance degradation.
Local host: yv
Space Requested: 16777216 B
Space Available: 1024 B
--------------------------------------------------------------------------
> Hello world from processor yv.noip.me, rank 0 out of 1 processors (pid=88929)
< Hello world from processor yv.noip.me, rank 0 out of 1 processors (pid=88929)
I see.
Try adding OMPI_MCA_btl_sm_backing_directory=/tmp and see how it works.
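That is, by analogy with the earlier runs, something like:
$ env OMPI_MCA_btl_sm_backing_directory=/tmp ./hello-world-1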
The error messages disappear when OMPI_MCA_btl_sm_backing_directory=/tmp is used.
We have seen and responded to this problem many times - I believe it is included in the docs somewhere. The problem is that BSD (mostly as seen on Mac) has created a default TMPDIR that is incredibly long. So when we add our tmpdir prefix (to avoid stepping on other people's tmp), the result is longer than the path length limits.
Solution: set TMPDIR in your environment to point to some shorter path, typically something like $HOME/tmp.
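For example (a sketch; the exact directory is your choice):
$ mkdir -p $HOME/tmp
$ export TMPDIR=$HOME/tmp
$ ./hello-world-1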
[...] a default TMPDIR that is incredibly long [...]
What do you mean by TMPDIR? In our case TMPDIR is just /tmp.
Indeed, it seems the root cause is something fishy related to /dev/shm
What if you run
df -h /dev/shm
both as a user and as root?
$ df -h /dev/shm
Filesystem Size Used Avail Capacity Mounted on
devfs 1.0K 0B 1.0K 0% /dev
# df -h /dev/shm
Filesystem Size Used Avail Capacity Mounted on
devfs 1.0K 0B 1.0K 0% /dev
That's indeed a small /dev/shm.
I still do not understand why running as a user does not get you the user-friendly message you get when running as root.
Can you ktrace as a non-root user, so we can figure out where the failure occurs?
Here is the ktrace dump for a regular user.
It seems regular users do not have write access to the (small) /dev/shm, and we do not display a friendly error message about it.
45163 hello-world-1 CALL access(0x4e3d8d33,0x2<W_OK>)
45163 hello-world-1 NAMI "/dev/shm"
45163 hello-world-1 RET access -1 errno 13 Permission denied
Unless you change that, your best bet is probably to add
btl_sm_backing_directory=/tmp
to your $PREFIX/etc/openmpi-mca-params.conf
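That is, something like this appended to the file (that the FreeBSD package prefix is /usr/local is my assumption):
# $PREFIX/etc/openmpi-mca-params.conf, e.g. /usr/local/etc/openmpi-mca-params.conf
btl_sm_backing_directory = /tmp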
Is direct access to /dev/shm new in Open MPI? It used to work fine on FreeBSD.
How does this work on Linux? Is everybody allowed write access to /dev/shm there?
Access to /dev/shm has a fallback in ompi, like here.
Why doesn't this fallback work then? Is it accidentally missing in some cases?
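For illustration, the kind of check-with-fallback involved looks roughly like the sketch below. This is a minimal sketch of the pattern, not Open MPI's actual code; the candidate paths and function names are mine:

/* Minimal sketch of an access-check-with-fallback for the shmem backing
 * directory. Not Open MPI's actual code; names and paths are illustrative. */
#include <stdio.h>
#include <unistd.h>

/* Return the first candidate directory the process can write to. */
static const char *pick_backing_dir(void) {
    static const char *candidates[] = { "/dev/shm", "/tmp" };
    for (size_t i = 0; i < sizeof(candidates) / sizeof(candidates[0]); i++) {
        /* access(2) with W_OK is exactly what shows up in the ktrace above. */
        if (access(candidates[i], W_OK) == 0) {
            return candidates[i];
        }
    }
    return NULL; /* no writable candidate: should trigger a friendly error */
}

int main(void) {
    const char *dir = pick_backing_dir();
    printf("backing dir: %s\n", dir ? dir : "(none)");
    return 0;
}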
I believe I've tried everything suggested (and then some), as evidenced by the following interactions:
(ioniser) jabowery@jaboweryML:~/devel/ioniser$ printenv |grep BulkData|grep tmp
OMPI_MCA_shmem_mmap_backing_file_base_dir=/mnt/BulkData/home/jabowery/tmp
btl_sm_backing_directory=/mnt/BulkData/home/jabowery/tmp
TMPDIR=/mnt/BulkData/home/jabowery/tmp
(ioniser) jabowery@jaboweryML:~/devel/ioniser$ tail /home/jabowery/mambaforge/envs/ioniser/etc/openmpi-mca-params.conf
# See "ompi_info --param all all --level 9" for a full listing of Open
# MPI MCA parameters available and their default values.
pml = ^ucx
osc = ^ucx
coll_ucc_enable = 0
mca_base_component_show_load_errors = 0
opal_warn_on_missing_libcuda = 0
opal_cuda_support = 0
btl_sm_backing_directory=/mnt/BulkData/home/jabowery/tmp
(ioniser) jabowery@jaboweryML:~/devel/ioniser$ tail /etc/openmpi/openmpi-mca-params.conf
btl_base_warn_component_unused=0
# Avoid openib in case applications use fork: see https://github.com/ofiwg/libfabric/issues/6332
# If you wish to use openib and know your application is safe, remove the following:
# Similarly for UCX: https://github.com/open-mpi/ompi/issues/8367
mtl = ^ofi
btl = ^uct,openib,ofi
pml = ^ucx
osc = ^ucx,pt2pt
btl_sm_backing_directory=/mnt/BulkData/home/jabowery/tmp
(ioniser) jabowery@jaboweryML:~/devel/ioniser$ !p
p ioniser.py
[jaboweryML:34571] shmem: mmap: an error occurred while determining whether or not /mnt/BulkData/home/jabowery/tmp/ompi.jaboweryML.1000/jf.0/121765888/shared_mem_cuda_pool.jaboweryML could be created.
[jaboweryML:34571] create_and_attach: unable to create shared memory BTL coordinating structure :: size 134217728
(ioniser) jabowery@jaboweryML:~/devel/ioniser$ whoami
jabowery
(ioniser) jabowery@jaboweryML:~/devel/ioniser$ touch /mnt/BulkData/home/jabowery/tmp/accesstest.txt
(ioniser) jabowery@jaboweryML:~/devel/ioniser$ ls -altr /mnt/BulkData/home/jabowery/tmp/accesstest.txt
-rw-rw-r-- 1 jabowery jabowery 0 Nov 1 10:51 /mnt/BulkData/home/jabowery/tmp/accesstest.txt
(ioniser) jabowery@jaboweryML:~/devel/ioniser$ df /mnt/BulkData/home/jabowery/tmp
Filesystem 1K-blocks Used Available Use% Mounted on
/dev/nvme1n1 1921725720 692366840 1131666768 38% /mnt/BulkData
(ioniser) jabowery@jaboweryML:~/devel/ioniser$
When I compile and run this test program on Arch, I, too, get the error message. When debugging why the error occurs, I found the following:
- A (randomly named) file in /dev/shm is (successfully) created.
- A backing file in /tmp is about to be created, which fails and prints the error message.
There seems to be something weird going on in shmem_mmap_module.c:
When creating the in-memory file under /dev/shm, it is located directly under /dev/shm (e.g. /dev/shm/sm_segment.kohni-mobil.1000.110f0000.0). enough_space() strips the file name off the path and checks whether there is enough space in memory, which is fine. After the check, the file is created via open() in line 347.
When creating the backing file in /tmp, it is located in a sub-directory structure (e.g. /tmp/ompi.kohni-mobil.1000/jf.0/286195712/shared_mem_cuda_pool.kohni-mobil). enough_space() is used again to check whether there is enough space. But since the function only strips the file name, and the directory structure is not yet created, opal_path_df() in path.c, line 683, fails in the call to statfs(). So, in the end, the backing file can never be created.
I guess either the directory structure needs to be created before the size check, or the size check needs to determine the base mount point instead of the (sub)directory to check the available size, or the backing file must not be created in a sub-directory...
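This failure mode can be reproduced in isolation; the sketch below uses statvfs() (the portable cousin of the statfs() call in opal_path_df()) against a hypothetical, never-created session path:

/* Demo: statvfs() fails with ENOENT when intermediate directories of the
 * queried path do not exist yet. The path below is hypothetical. */
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/statvfs.h>

int main(void) {
    struct statvfs vfs;
    /* Parent directories of this path have not been created. */
    const char *dir = "/tmp/ompi.example.1000/jf.0/286195712";
    if (statvfs(dir, &vfs) != 0) {
        /* Prints "statvfs failed: No such file or directory" -- the same
         * ENOENT that makes the free-space check report a failure. */
        printf("statvfs failed: %s\n", strerror(errno));
        return 1;
    }
    printf("available bytes: %llu\n",
           (unsigned long long)vfs.f_bavail * vfs.f_frsize);
    return 0;
}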
Best, Jan
Something is off here. This directory name (ompi.kohni-mobil.1000) is something that would be produced by an mpirun for OMPI v4 or earlier - it most definitely is not the name of the top-level directory used by OMPI v5 (which would look like prterun.kohni-mobil.3102.501). Looks like you are attempting to launch an app compiled against OMPI v5 using an earlier mpirun? That will not work.
Hm, currently it looks like this:
jankoh@kohni-mobil untitled $ ldd cmake-build-debug/untitled | grep mpi
libmpi.so.40 => /usr/lib/libmpi.so.40 (0x00007fc059e00000)
jankoh@kohni-mobil untitled $ pacman -Qo /usr/lib/libmpi.so.40
/usr/lib/libmpi.so.40 is owned by openmpi 5.0.6-2
jankoh@kohni-mobil untitled $ mpirun --help
mpirun (Open MPI) 5.0.6
Usage: mpirun [OPTION]...
See the mpirun(1) man page or HTML help for a detailed list of command
line options that are available.
Report bugs to https://www.open-mpi.org/community/help/
jankoh@kohni-mobil untitled $
I find it a bit strange that the shared lib is named *.40. The CMake rules for the project are quite simple, too:
find_package(MPI QUIET REQUIRED)
add_executable(untitled main.cpp)
target_link_libraries(untitled PUBLIC MPI::MPI_CXX)
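For completeness, the whole CMakeLists.txt is essentially just the following (the minimum-version line is an assumption):
cmake_minimum_required(VERSION 3.15)
project(untitled CXX)
find_package(MPI QUIET REQUIRED)
add_executable(untitled main.cpp)
target_link_libraries(untitled PUBLIC MPI::MPI_CXX)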
Edit: when I run the program via mpirun, the error vanishes...
The ".40" is just from a libtool convention - the number has nothing to do with the OMPI version itself.
I missed that this is happening only when executed as a singleton. A quick glance at the code shows that OMPI is missing a couple of lines - trivial fix.
Give this a try:
diff --git a/ompi/runtime/ompi_rte.c b/ompi/runtime/ompi_rte.c
index 2a2d66bbc3..2ba9483c98 100644
--- a/ompi/runtime/ompi_rte.c
+++ b/ompi/runtime/ompi_rte.c
@@ -69,6 +69,7 @@ opal_process_name_t pmix_name_invalid = {UINT32_MAX, UINT32_MAX};
* session directory structure, then we shall cleanup after ourselves.
*/
static bool destroy_job_session_dir = false;
+static bool destroy_proc_session_dir = false;
static int _setup_top_session_dir(char **sdir);
static int _setup_job_session_dir(char **sdir);
@@ -995,9 +996,12 @@ int ompi_rte_finalize(void)
opal_process_info.top_session_dir = NULL;
}
- if (NULL != opal_process_info.proc_session_dir) {
+ if (NULL != opal_process_info.proc_session_dir && destroy_proc_session_dir) {
+ opal_os_dirpath_destroy(opal_process_info.proc_session_dir,
+ false, check_file);
free(opal_process_info.proc_session_dir);
opal_process_info.proc_session_dir = NULL;
+ destroy_proc_session_dir = false;
}
if (NULL != opal_process_info.app_sizes) {
@@ -1174,6 +1178,7 @@ static int _setup_top_session_dir(char **sdir)
static int _setup_job_session_dir(char **sdir)
{
+ int rc;
/* get the effective uid */
uid_t uid = geteuid();
@@ -1185,18 +1190,33 @@ static int _setup_job_session_dir(char **sdir)
opal_process_info.job_session_dir = NULL;
return OPAL_ERR_OUT_OF_RESOURCE;
}
+ rc = opal_os_dirpath_create(opal_process_info.job_session_dir, 0755);
+ if (OPAL_SUCCESS != rc) {
+ // could not create session dir
+ free(opal_process_info.job_session_dir);
+ opal_process_info.job_session_dir = NULL;
+ return rc;
+ }
destroy_job_session_dir = true;
return OPAL_SUCCESS;
}
static int _setup_proc_session_dir(char **sdir)
{
+ int rc;
+
if (0 > opal_asprintf(sdir, "%s/%d",
opal_process_info.job_session_dir,
opal_process_info.my_name.vpid)) {
opal_process_info.proc_session_dir = NULL;
return OPAL_ERR_OUT_OF_RESOURCE;
}
-
+ rc = opal_os_dirpath_create(opal_process_info.proc_session_dir, 0755);
+ if (OPAL_SUCCESS != rc) {
+ // could not create session dir
+ free(opal_process_info.proc_session_dir);
+ opal_process_info.proc_session_dir = NULL;
+ return rc;
+ }
return OPAL_SUCCESS;
}
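To try it, apply the diff to an Open MPI source tree and rebuild, e.g. (the patch file name here is hypothetical):
$ git apply session-dir-fix.patch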
@rhc54 Having applied your patch to openmpi and using the patched openmpi in my small version of this test program, I can confirm, the patch works, the error message vanishes.
Thanks a lot, and best regards, Jan
Hey @rhc54 -- do we need this as a PR?
If so, where does destroy_proc_session_dir get set to true?
EDIT: Never mind -- I see https://github.com/open-mpi/ompi/pull/13003 😄
Hey @rhc54 -- do we need this as a PR?
I don't need it - but you guys do 😄
We have seen and responded to this problem many times - I believe it is included in the docs somewhere. The problem is that BSD (mostly as seen on Mac) has created a default TMPDIR that is incredibly long. So when we add our tmpdir prefix (to avoid stepping on other people's tmp), the result is longer than the path length limits. Solution: set TMPDIR in your environment to point to some shorter path, typically something like $HOME/tmp.
This issue is the first result that turns up on Google (I hit the same error with [email protected] %[email protected] arch=darwin-sequoia-m2); a search of the openmpi site returned no hits, and the internal doc search isn't smart enough to look for multiple keywords. Should I figure out how to open a PR for the docs, or was this fixed in 5.0.6 (see #13003)? Thanks!
(Also, I guess this issue can officially be closed by #13003 ?)
I didn't make any changes to the docs, so if there is something missing there, you are welcome to fill the void!
Perhaps @jsquyres can provide you with some direction on how to open the PR?
And yes - we can officially close this now.