ucx

Unified Communication X (mailing list - https://elist.ornl.gov/mailman/listinfo/ucx-group)

Results 413 ucx issues

We often (but not always) see errors like: ``` mlx5dv_devx_create_event_channel() failed: Protocol not supported ``` Which results in `Input/output error`, and our test application failing. We are using UCX 1.10.1...

Bug

### Describe the bug For UCX 1.10.1, compiled on CentOS7.9 against RDMA-core 33.1, OpenMPI (versions 3.1, and 4.1) gives the following error: rc_mlx5_devx.c:99 UCX ERROR mlx5dv_devx_create_event_channel() failed: Protocol not supported...

Bug

Hi, I'm encountering an XPMEM-related bus error crash during the finalization of MPI applications. I can reliably reproduce it with 16+ ranks. Versions: ``` Open MPI v5.0.0rc6 UCX v1.12.1 ```...

Bug

Trying to run a UCX based Open MPI with each process in a user namespace (container) breaks UCX completely it seems: ``` mm_posix.c:445 UCX ERROR Error returned from open in...

## What Fix connection matching in UD transport. ## Why ? If the user destroyed the UCT EP, we shouldn't disconnect it, because it could be used for RX operations. ## How...

### Describe the bug When compiling ucx with NVHPC compilers v22.3, and using mlx5_0:1, ucx_perftest hangs, and when `^C`'ing, the server triggers an assertion. The exact same setup with GCC...

Bug

## What Enforce types on enums, to the extent possible by the C standard and the compilers we use. ## Why ? Prevent human errors caused by incorrect usage of...

WIP-DNM

## What Make sure the DCT is closed prior to doing `uct_ep_check`. ## Why ? Fixes #6194. It seems `uct_ep_check` couldn't detect an error right after the DCT was closed. ## How ? Add...

``` 2021-01-25T08:17:05.7909604Z [----------] 1 test from dc_mlx5/test_uct_peer_failure_keepalive 2021-01-25T08:17:05.7910538Z [ RUN ] dc_mlx5/test_uct_peer_failure_keepalive.killed/0 2021-01-25T08:17:05.7911426Z [ INFO ] Testing component: ib 2021-01-25T08:17:15.9722092Z /scrap/azure/agent-07/AZP_WORKSPACE/1/s/contrib/../test/gtest/uct/test_peer_failure.cc:521: Failure 2021-01-25T08:17:15.9723788Z Value of: m_err_count 2021-01-25T08:17:15.9724541Z Actual: 0 2021-01-25T08:17:15.9725139Z...

Bug
CI (AZP/Jenkins)

Is there a reason why UCX does not support IBV_WR_RDMA_WRITE_WITH_IMM? Grepping the source code did not find any hits.

Change request