
GTEST/UD: Increase UD EP timeout when running under valgrind - v1.17.x

Open iyastreb opened this issue 1 year ago • 0 comments

This is a double commit of https://github.com/openucx/ucx/pull/9880 into the v1.17.x branch.

Fix for RM#3918537

I managed to reproduce this issue on rock machines in 100% of cases, but only when running the test under high CPU load, which I generate with 64 dummy processes (yes > /dev/null). I checked the ud_ep timeout logic, and it seems to work correctly. So the reasonable fix is to increase UCX_UD_TIMEOUT from 30s to 300s when running under valgrind. With the increased timeout the issue is no longer reproducible, even with artificial CPU load.
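For anyone who wants to recreate the load condition, here is a minimal sketch of the setup described above. The helper names `start_load` and `stop_load` and the `LOAD_PIDS` variable are hypothetical and not part of UCX; only the `yes > /dev/null` trick and the count of 64 come from the description.

```shell
#!/bin/sh
# Hypothetical helpers (not part of UCX): create artificial CPU load
# with N background `yes > /dev/null` processes, then clean them up.

LOAD_PIDS=""

# start_load N: spawn N busy processes and remember their PIDs.
start_load() {
    n=$1
    i=0
    while [ "$i" -lt "$n" ]; do
        yes > /dev/null &
        LOAD_PIDS="$LOAD_PIDS $!"
        i=$((i + 1))
    done
}

# stop_load: kill every process started by start_load and reap it.
stop_load() {
    # Word splitting on $LOAD_PIDS is intentional: one argument per PID.
    kill $LOAD_PIDS 2>/dev/null || true
    wait $LOAD_PIDS 2>/dev/null || true
    LOAD_PIDS=""
}
```

A typical session would be `start_load 64`, then running the failing UD test under valgrind, then `stop_load`. Exporting `UCX_UD_TIMEOUT=300s` before a run should approximate the effect of this change on a build that does not include it, assuming the test does not override that setting itself.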

iyastreb • May 30 '24 06:05