David--Cléris Timothée

Results 26 comments of David--Cléris Timothée

It does indeed seem related to #12849. Basically, a pointer used for communication between two GPUs gets registered for IPC, and the IPC handle is never released, which prevents the memory from...

Any update on this issue? It affects production quite significantly...

I agree that this shouldn't be an issue in principle. However, when I check with nvidia-smi, the actual memory usage grows by an amount similar to the active cuIpc handles,...
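The kind of check described in this comment can be scripted. Below is a minimal sketch, assuming `nvidia-smi` is on the `PATH` and reports a numeric `memory.used` value for every GPU; the helper names (`parse_memory_used`, `sample_memory_used`, `growth`) are hypothetical and not taken from the original thread.

```python
import subprocess

# Standard nvidia-smi query: one line per GPU, used memory in MiB, no header or units.
QUERY = ["nvidia-smi", "--query-gpu=memory.used", "--format=csv,noheader,nounits"]


def parse_memory_used(output: str) -> list[int]:
    """Parse one integer (MiB) per non-empty line of the query output.

    Assumption: every GPU reports a number; a non-numeric value would raise ValueError.
    """
    return [int(line.strip()) for line in output.splitlines() if line.strip()]


def sample_memory_used() -> list[int]:
    """Run nvidia-smi once and return the used memory per GPU in MiB."""
    result = subprocess.run(QUERY, capture_output=True, text=True, check=True)
    return parse_memory_used(result.stdout)


def growth(before: list[int], after: list[int]) -> list[int]:
    """Per-GPU difference between two samples, in MiB."""
    return [a - b for b, a in zip(before, after)]
```

Calling `sample_memory_used()` before and after a batch of communications and passing both samples to `growth` gives the per-GPU increase, which can then be compared against the number of active IPC handles.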

I managed to get somewhat of a reproducer (in SYCL, though it is transparent to CUDA). Here is the end of the output; clearly the program memory is unchanged,...

Any update on this issue? It affects production quite significantly...

Have you also changed the read and write parts of the scan? The issue was mostly that the scan does not read and write at the same slot for...

I just tried with

```bash
#!/bin/bash -l
#PBS -A Shamrock
#PBS -N scale_256_hybrid
#PBS -l walltime=0:15:00
#PBS -l select=256
#PBS -l place=scatter
#PBS -l filesystems=home:flare
#PBS -q prod
#PBS -k...
```

Hi, I haven't been able to continue on this over the last few months. I will try to make a reproducer to ease the tracking.

Hi, I think I'm currently encountering this exact issue on a workstation. Basically, using MPI communications on CUDA-allocated memory results in memory leaks. What is the current status of...

> .... > I'm able to reproduce the issue. cuda-ipc transport in UCX caches peer mappings and a free call of peer mapped memory is not guaranteed to release memory....
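To illustrate the behaviour described in the quote, here is a toy model in plain Python. It is not UCX code and makes no claim about how UCX is implemented; the class `ToyPeerMappingCache` and all of its methods are hypothetical. It only shows why device memory can stay resident after a free when a peer mapping to it is still cached.

```python
class ToyPeerMappingCache:
    """Toy model: a freed buffer stays resident while a peer mapping to it is cached."""

    def __init__(self):
        self.device_bytes_in_use = 0
        self._sizes = {}          # ptr -> size of every buffer still resident
        self._mapped = set()      # ptrs for which a peer mapping is cached
        self._freed = set()       # ptrs freed by the owner but still mapped
        self._next_ptr = 1

    def alloc(self, size):
        ptr = self._next_ptr
        self._next_ptr += 1
        self._sizes[ptr] = size
        self.device_bytes_in_use += size
        return ptr

    def send_to_peer(self, ptr):
        # Communicating the buffer creates a cached peer mapping.
        self._mapped.add(ptr)

    def free(self, ptr):
        if ptr in self._mapped:
            # The mapping is still cached, so the memory is not released yet.
            self._freed.add(ptr)
        else:
            self._release(ptr)

    def purge_cache(self):
        # Dropping the cached mappings finally lets freed buffers be released.
        for ptr in list(self._mapped):
            self._mapped.discard(ptr)
            if ptr in self._freed:
                self._freed.discard(ptr)
                self._release(ptr)

    def _release(self, ptr):
        self.device_bytes_in_use -= self._sizes.pop(ptr)


def simulate(iterations, size):
    """Allocate, communicate, and free a buffer repeatedly; return the model."""
    model = ToyPeerMappingCache()
    for _ in range(iterations):
        ptr = model.alloc(size)
        model.send_to_peer(ptr)
        model.free(ptr)
    return model
```

In this model, `simulate(10, 1024)` leaves all ten buffers counted as in use even though each one was freed, and the count only drops once `purge_cache()` is called. That mirrors the reported symptom of memory growing with the number of active handles.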