Marcin Krotkiewski

Results: 12 issues by Marcin Krotkiewski

Apologies for a long report, but I don't know yet how to group this into smaller issues. I guess this is mostly a question to the Mellanox/NVIDIA folks here. We...

I am testing HPCX 2.10 (UCX 1.12, OpenMPI 4.1.2rc4) on a 2-socket EPYC 7742 system using the `osu_bibw` benchmark. I test the in-node bandwidth: both ranks are started on the same...

Bug

I'm using xpmem in our home-brew application (OpenMPI + our own xpmem for in-node comm), on an AMD EPYC cluster, 7.7 (Maipo), kernel 3.10.0-1062.9.1.el7.x86_64. Sometimes after the application finishes, multiple...

I am looking at OpenMPI 4.0.3 and HDF5 1.10.6 compiled against it. A user reported a segfault in `ADIOI_Flatten()` when using a chunked dataset, i.e., when the following line is executed:...

Target: v5.0.x

I am seeing bad performance with in-node cross-GPU data exchanges on nodes with two H100 cards on PCI-E. First, I have seen #9287. Using the master branch instead of 1.15.0...

Bug

I am using SuperLU to compute and apply an ILU-based preconditioner. I compute the factors (and solve the system) using `zgsisx`. Since SuperLU does not have GPU support, I am...

This is mostly meant as a discussion, as I don't know if this functionality is possible / simple to implement. Currently the generated Fortran bindings have generic argument names `arg`:...

Currently, when returning a complex type to Fortran, `cpp_bindgen` always returns `bindgen_handle *` as `type(c_ptr)` in Fortran. As a result, the actual type of the Fortran object is unknown and...

In GHEX we need to pass a callback function from Fortran to the C++ library. The Fortran-C interface allows us to do this with the `type(c_funptr)` type, e.g.: [...]...

I am implementing a communication pattern where GPUs exchange parts of their local data vector. The exchanged vector entries are 'unstructured' (arbitrary indices) with a block size of ~8 KB: for each...

question