Jerry Liu
The CI errors are unrelated to this PR:

1. [go gpu](https://dev.azure.com/ucfconsort/0b36e3f0-8ab9-4a48-b68b-4b2350e02c88/_apis/build/builds/53156/logs/334)
```
2022-10-25T15:51:21.6848238Z ##[warning]Module dev/go-latest cannot be loaded
2022-10-25T15:51:21.6863122Z ##[error]Bash exited with code '1'.
```
2. [Tests gpu on...
@yosefe Let me know if any other places need to be changed. If needed, this PR should be tested at large scale to check whether there is an obvious performance improvement.
@yosefe What's the next step?
Within one process:

1. Thread A
```
cudaMalloc(&buffer, 100);
```
2. Thread B
```
void *base_address = (void*)buffer;
size_t alloc_length = 100;
cuMemGetAddressRange((CUdeviceptr*)&base_address, &alloc_length, (CUdeviceptr)buffer);
```
Is it right to...
It reports an error with the code below: ``` #include #include #include #include #include #include /* * $ gcc -DEXTRA x.c -lcuda -lcudart -lpthread -o mytest * $ ./mytest * cuda *...
To avoid misunderstanding, some additional information: without this PR, DC works as expected when using LAG hash mode. For LAG hash mode, if the QP is set affinity...
According to the failure logs below, the failed test cases are unrelated to this PR: 1. [UCX PR(jucx gpu java8)](https://dev.azure.com/ucfconsort/ucx/_build/results?buildId=55069&view=logs&j=2d17f13d-f402-5448-dd96-eeac985c8863&t=61d25ca5-2380-522c-68e5-91463ebbd3bf&l=461) ``` 2022-12-01T08:54:52.3549594Z [1669884892.354152] [swx-rdmz-ucx-gpu-02:20922:0] memtrack.c:328 UCX WARN allocated zero-size block...
@yosefe It has passed CI after squashing the commits.
Confirmed: none of the error logs are related to this PR. It seems very strange that there are 11 CI test case failures, all unrelated to this PR.
bot:retest