bosilca

Results 318 comments of bosilca

I can replicate it by just changing the type (L2 instead of L3). It seems the root cause is an incorrect locality in the proc info (`proc_flags`) in `ompi_comm_split_type_get_part`. I...

Let me take back my earlier comment about my flags not being set correctly. I was testing on OSX, where it is impossible to bind processes to specific resources. Thus,...

All good on Linux, works as expected. Here is how to run it to get as much info as possible from the runtime: `mpirun -x HWLOC_DEBUG_VERBOSE=0 -np 8 --report-bindings ./comm_split`....

I will not pretend I understand why, but according to your output, as soon as you start more than 16 processes (on your 32-core node) your processes are not...

:+1: I have a proposal: let's drop the memcpy functions and use a vector_copy instead, defined as (count, blen, stride). This is the most basic type in the datatype, and...
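To make the (count, blen, stride) idea concrete, here is a minimal sketch of what such a copy could look like. The function name, signature, and the choice of a separate byte stride for each side are assumptions for illustration only, not the actual Open MPI datatype engine code.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not an Open MPI API): copy `count` blocks of
 * `blen` bytes each. Consecutive blocks start `src_stride` bytes apart
 * in the source and `dst_stride` bytes apart in the destination. */
static void vector_copy(void *dst, ptrdiff_t dst_stride,
                        const void *src, ptrdiff_t src_stride,
                        size_t count, size_t blen)
{
    assert(dst != NULL && src != NULL);

    char *d = dst;
    const char *s = src;
    for (size_t i = 0; i < count; i++) {
        memcpy(d, s, blen);   /* one contiguous block */
        d += dst_stride;      /* advance to the next block on each side */
        s += src_stride;
    }
}
```

With both strides equal to `blen`, this degenerates to a plain contiguous copy, which is presumably why a single vector form could stand in for the dedicated memcpy paths.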

Fixing the typos is good, but enabling HAN at the node level needs to be backed by some evidence. Why is this necessary? In which case this leads to...

The first part of the check will only succeed if all processes are local (aka spawned by the same RTE daemon). I don't think that can happen for an INTRA...

In the current code, step 7 is skipped and control goes directly from your step 6 to your step 8. This brings me back to my...

@gpaulsen you can do something like the following to get the assembly code for the function above: `objdump -d libmpi.so | awk -v RS= '/^[[:xdigit:]]+ mca_pml_ob1_recv_req_start/'`

This is indeed the email that triggered my interest in the bug. Thanks @marekdam for pointing it out. @jsquyres we need this in main/5.0 and 4.1.