rmm
Remove CUDA event deadlocking issues in device mr tests
This PR fixes both deadlocks caused by the assumption that std::mutex schedules fairly, and works around deadlocks seen when CUDA events are created in very short-lived threads (< 10 ms).
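For context on the first point: the C++ standard gives no fairness guarantee for std::mutex, so a thread that unlocks and immediately relocks may keep winning the lock while another thread waits. The sketch below is not taken from the RMM tests; it only illustrates one common way to avoid depending on fairness, by handing off turns explicitly through a condition variable. The names `run_alternating`, `turn`, and `order` are hypothetical.

```cpp
#include <condition_variable>
#include <cstddef>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical helper (not part of RMM): two threads take strict turns.
// Ordering is enforced by the `turn` variable and a condition variable,
// not by any assumption about which waiter std::mutex wakes first.
std::vector<int> run_alternating(int rounds)
{
  std::mutex mtx;
  std::condition_variable cv;
  int turn = 0;            // id of the thread allowed to proceed next
  std::vector<int> order;  // records which thread ran at each step

  auto worker = [&](int id) {
    for (int i = 0; i < rounds; ++i) {
      std::unique_lock<std::mutex> lock(mtx);
      // Sleep until it is explicitly this thread's turn.
      cv.wait(lock, [&] { return turn == id; });
      order.push_back(id);
      turn = 1 - id;  // hand the turn to the other thread
      lock.unlock();
      cv.notify_all();
    }
  };

  std::thread t0(worker, 0);
  std::thread t1(worker, 1);
  t0.join();
  t1.join();
  return order;
}
```

Calling `run_alternating(100)` should return 200 entries that alternate between thread 0 and thread 1, regardless of how the platform schedules mutex waiters. A version that only locked and unlocked the mutex in a loop would have no such guarantee.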
@ajschmidt8 please test on ARM before we merge.
I never tested the problematic code outside of CI, so I have no way of verifying whether this fix works as intended. I'll defer to the devs for the approvals here. If this fix looks good to everyone else, let's get it merged and Ops will add these changes to our GitHub Actions POC PR to see if we still experience any issues.
@gpucibot merge