Segfault at thread exit with latest main branch
Background information
What version of Open MPI are you using? (e.g., v3.0.5, v4.0.2, git branch name and hash, etc.)
Main branch commit 4265e248c
Describe how Open MPI was installed (e.g., from a source/distribution tarball, from a git clone, from an operating system distribution package, etc.)
Installed from git clone.
If you are building/installing from a git clone, please copy-n-paste the output from git submodule status.
$ git submodule status
6692c28a4daed5e99443eb724231d7300287fb2c ../3rd-party/openpmix (v1.1.3-3495-g6692c28a)
7ae2c083189db0881d2eff29d71bd507be02bad3 ../3rd-party/prrte (psrvr-v2.0.0rc1-4340-g7ae2c08318)
Please describe the system on which you are running
- Operating system/version: Ubuntu 18.04.5, kernel 4.15.0-142-generic
- Computer hardware:
- Network type:
Details of the problem
I have a use case where I initialize/finalize MPI from a dynamically loaded .so plugin file. Building the plugin with the latest ompi main branch leads to a segfault at pthread exit time. It works fine with 4.1.2 though.
Steps to reproduce:
- create the following files:
test.c, plugin.h, mpi_plugin.c
$ cat test.c
#include <stdio.h>
#include <unistd.h>
#include <dlfcn.h>
#include <assert.h>
#include <pthread.h>
#include "plugin.h"

static void *plugin_hdl;

/* Load the plugin and call its plugin_init(), which calls MPI_Init(). */
int my_init(const char *plugin, handle_t *handle) {
    int (*plugin_init)(handle_t *handle);
    int status = 0;

    plugin_hdl = dlopen(plugin, RTLD_NOW);
    assert(plugin_hdl);
    plugin_init = dlsym(plugin_hdl, "plugin_init");
    status = plugin_init(handle);
    assert(!status);
    printf("initialized\n");
    return status;
}

/* Call the plugin's finalize hook (MPI_Finalize()), then unload the plugin. */
int my_finalize(handle_t *handle) {
    int status = handle->finalize(handle);
    assert(!status);
    dlclose(plugin_hdl);
    printf("finalized\n");
    return 0;
}

void *t_func(void *arg)
{
    printf("thread started\n");
    sleep(1);
    printf("thread exiting\n");
    return NULL;
}

int main(int argc, char **argv)
{
    handle_t handle;
    int rc;
    pthread_t t1;

    rc = my_init("test_plugin.so", &handle);
    assert(!rc);
    rc = my_finalize(&handle);
    assert(!rc);

    /* The thread is created only after the plugin has been unloaded. */
    rc = pthread_create(&t1, NULL, t_func, NULL);
    assert(!rc);
    pthread_join(t1, NULL);
    return 0;
}
$ cat plugin.h
#ifndef PLUGIN_H
#define PLUGIN_H

typedef struct handle {
    int pg_rank;
    int pg_size;
    int (*finalize)(struct handle *handle);
} handle_t;

int plugin_init(handle_t *handle);

#endif
$ cat mpi_plugin.c
#include <assert.h>
#include <mpi.h>
#include "plugin.h"

static int mpi_finalize(handle_t *handle) {
    int rc = MPI_Finalize();
    assert(rc == MPI_SUCCESS);
    return 0;
}

int plugin_init(handle_t *handle) {
    int rc = MPI_Init(NULL, NULL);
    assert(rc == MPI_SUCCESS);
    handle->finalize = mpi_finalize;
    return 0;
}
$ mpicc -shared -fPIC -o test_plugin.so mpi_plugin.c
$ gcc test.c -ldl -pthread
$ mpirun -n 1 ./a.out
@streichler FYI
I am unable to reproduce. Are you using an external HWLOC installation perchance?
Can you share a backtrace of the segv (compiled with -g)?
I built with --with-hwloc=internal. I also tested on another system and got the same segfault. I'm on commit 49460b41.
This is the backtrace I get:
#0 0x00007ffff7f87580 in ?? ()
#1 0x00007ffff79b78b9 in advise_stack_range (guardsize=<optimized out>, pd=140737061897984, size=<optimized out>, mem=0x7fffe614c000) at allocatestack.c:386
#2 start_thread (arg=0x7fffe694c700) at pthread_create.c:552
#3 0x00007ffff76e071f in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95
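Frame #0 has no symbol. That would be consistent with the thread-exit path calling into code that dlclose() has already unmapped, though the backtrace alone does not prove it. A minimal sketch for testing that hypothesis follows, assuming glibc's RTLD_NODELETE and RTLD_NOLOAD flags are available. The helper names open_pinned and is_still_loaded are hypothetical and not part of the reproducer.

```c
#include <assert.h>
#include <dlfcn.h>
#include <stddef.h>

/* Hypothetical helper: open a shared object so that a later dlclose()
 * does not unmap it (glibc RTLD_NODELETE). */
void *open_pinned(const char *path)
{
    assert(path != NULL);
    return dlopen(path, RTLD_NOW | RTLD_NODELETE);
}

/* Hypothetical helper: report whether a shared object is currently
 * loaded in this process. RTLD_NOLOAD never loads anything; it returns
 * a handle only if the object is already resident. */
int is_still_loaded(const char *name)
{
    void *h;

    assert(name != NULL);
    h = dlopen(name, RTLD_NOW | RTLD_NOLOAD);
    if (h == NULL)
        return 0;
    dlclose(h); /* drop the reference that RTLD_NOLOAD just took */
    return 1;
}
```

Swapping the dlopen() call in my_init() for open_pinned(plugin) and rerunning would show whether the crash depends on the plugin being unmapped. Calling is_still_loaded() after my_finalize() on the plugin, or on a library it pulls in, would show what is still resident at thread exit. This is a diagnostic only, not a fix.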
Thanks. Can you share your config.log? I am unable to reproduce on either a RHEL box or Ubuntu 18.04.
It looks like this issue is expecting a response, but hasn't gotten one yet. If there are no responses in the next 2 weeks, we'll assume that the issue has been abandoned and will close it.
Per the above comment, it has been a month with no reply on this issue. It looks like this issue has been abandoned.
I'm going to close this issue. If I'm wrong and this issue is not abandoned, please feel free to re-open it. Thank you!