bosilca

Results 318 comments of bosilca

I'm not sure I understand the comment about marking the dependency at CTL, because a dependency cannot be a `CTL`, only a flow can, and that can only be done...

The original associated with these device-owned copies should not have a valid dc?

I don't understand why it stopped working, because this entire mechanism was added for exactly this purpose: helping non-blocking/persistent to release their temporary buffers. I need some...

According to your configure line you build with CUDA from `/usr/local/cuda` while the error message indicates the missing function is from `/lib/x86_64-linux-gnu/libcuda.so`. You might have a mismatch in your `LD_LIBRARY_PATH`....
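
One way to look for this kind of mismatch is to list every directory on the search path that holds a copy of the library, in search order. The following is a minimal sketch: the helper name `library_candidates` is hypothetical, and it only inspects an `LD_LIBRARY_PATH`-style string, not the loader cache or any RPATH/RUNPATH entries.

```python
import os


def library_candidates(libname, search_path):
    """Return the directories of a colon-separated path that contain libname.

    Hypothetical helper: it mimics only the LD_LIBRARY_PATH part of the
    dynamic loader's lookup (no ld.so.cache, RPATH or RUNPATH handling).
    """
    hits = []
    for directory in search_path.split(":"):
        if not directory:
            # The real loader treats an empty entry as the current
            # directory; this sketch simply ignores it.
            continue
        if os.path.isfile(os.path.join(directory, libname)):
            hits.append(directory)
    return hits


if __name__ == "__main__":
    # Print, in search order, every directory that provides libcuda.so.
    for d in library_candidates("libcuda.so", os.environ.get("LD_LIBRARY_PATH", "")):
        print(d)
```

If more than one directory is printed, the first one is the copy this part of the lookup would prefer, which may not be the one the build was configured against.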

Try this:
```diff
diff --git a/ompi/mpi/c/type_get_envelope_c.c b/ompi/mpi/c/type_get_envelope_c.c
index 24229e327c..999174776a 100644
--- a/ompi/mpi/c/type_get_envelope_c.c
+++ b/ompi/mpi/c/type_get_envelope_c.c
@@ -62,8 +62,13 @@ int MPI_Type_get_envelope_c(MPI_Datatype type,
 }
 /* TODO:BIGCOUNT: Need to embiggen ompi_datatype_get_args */
-...
```

Is it correct to assume that you installed the cuda-11.8 you built in /lib64? Is it possible that your LD_LIBRARY_PATH is not correctly propagated to the compute nodes...

NCCL inherits the rank from the MPI process, and in this case both processes think they are rank 0 in an MPI_COMM_WORLD of size 1, as if there were two...
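
This symptom can be spotted by having every process report the rank and size it sees and then checking for duplicates. The following is a minimal sketch, assuming the processes are started by Open MPI's launcher, which exports `OMPI_COMM_WORLD_RANK` and `OMPI_COMM_WORLD_SIZE`. The function names are hypothetical, and the fallback values of 0 and 1 are this sketch's own choice for a process started without the launcher.

```python
import os
from collections import Counter


def world_view(env=os.environ):
    """Return the (rank, size) pair this process would report.

    The fallback values (rank 0, size 1) are an assumption of this sketch.
    """
    rank = int(env.get("OMPI_COMM_WORLD_RANK", 0))
    size = int(env.get("OMPI_COMM_WORLD_SIZE", 1))
    return rank, size


def duplicated_ranks(views):
    """Given (rank, size) pairs collected from all processes, list ranks seen more than once."""
    counts = Counter(rank for rank, _ in views)
    return sorted(rank for rank, n in counts.items() if n > 1)


if __name__ == "__main__":
    print("rank %d of %d" % world_view())
```

Two processes that both print `rank 0 of 1` correspond to `duplicated_ranks([(0, 1), (0, 1)])` returning `[0]`, which is the situation described above.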

I am not sure I understand the premise of this question. What do you mean by two systems sharing the same DRAM? Are we talking about two virtual hosts or containers running...

I am not familiar with what you describe here. It sounds somewhat similar to running virtual OSes side by side, and the reserved shared memory seems similar to [Inter-VM Shared Memory (IVSHMEM)](https://www.qemu.org/docs/master/system/devices/ivshmem.html)...

I'm not sure we are talking about the same thing. I was not sure about UCC but I'm absolutely certain all OMPI internal allreduce algorithms are using MPI sends and...
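
To illustrate how an allreduce can be expressed purely as pairwise message exchanges, here is a small single-process simulation of recursive doubling. It is a sketch and not Open MPI's code: the `mailbox` dictionary stands in for point-to-point messages, the function name is hypothetical, and it assumes a power-of-two number of ranks and a commutative, associative operation.

```python
def allreduce_recursive_doubling(values, op=lambda a, b: a + b):
    """Simulate an allreduce over len(values) ranks; return each rank's result.

    Assumptions of this sketch: power-of-two rank count, and an operation
    that is commutative and associative.
    """
    n = len(values)
    if n == 0 or n & (n - 1):
        raise ValueError("this sketch needs a power-of-two number of ranks")
    current = list(values)
    distance = 1
    while distance < n:
        # "Send": every rank posts its partial result to its partner,
        # the rank whose index differs in exactly one bit.
        mailbox = {rank ^ distance: current[rank] for rank in range(n)}
        # "Receive": every rank combines its own partial result with
        # the one its partner sent.
        current = [op(current[rank], mailbox[rank]) for rank in range(n)]
        distance *= 2
    return current
```

After log2(n) exchange rounds every rank holds the full reduction, so `allreduce_recursive_doubling([1, 2, 3, 4])` yields `[10, 10, 10, 10]`.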