
memory problems with CUDA-based rings

Open jaycedowell opened this issue 5 years ago • 2 comments

A couple of times now I have run into problems passing data between blocks using CUDA-based rings. If I don't force a bifrost.device.synchronize_stream() within the reserve context for the ring, I end up with inconsistent results when reading from the ring in another block. I think what is happening is that the ring doesn't know about the asynchronous copies and happily marks the reserved segment as good to go when the reserve is released. Is there a better way to deal with this than sprinkling synchronize_stream() calls around?

jaycedowell avatar Apr 01 '20 21:04 jaycedowell

Bifrost asynchronicity is based around CPU threads each having their own CUDA stream. All GPU work in a CPU thread must be synchronous with respect to that thread, so it must be followed by a stream synchronize before anything is released to other threads. (Using async CUDA APIs and then synchronizing on a per-CPU-thread stream ensures that GPU work is synchronous within the CPU thread but asynchronous between threads.)

E.g., the pipeline infrastructure does this for all blocks here: https://github.com/ledatelescope/bifrost/blob/8a059b3/python/bifrost/pipeline.py#L462
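
To make the ordering concrete, here is a rough pure-Python model of the rule above. It does not use Bifrost or CUDA: `FakeStream`, `FakeRing`, and `write_span` are hypothetical stand-ins made up for this sketch, and the "stream" simply defers its copies until `synchronize()` is called. On real hardware the unsynchronized case is a race, so the reader may see partial or stale data rather than the clean zeros shown here.

```python
# Pure-Python model of "synchronize the stream before releasing the span".
# FakeStream and FakeRing are hypothetical stand-ins, not Bifrost classes.
from contextlib import contextmanager


class FakeStream:
    """Models a CUDA stream: work is queued and only runs on synchronize()."""

    def __init__(self):
        self._pending = []

    def copy_async(self, dst, src):
        # Returns immediately, like an async memcpy; nothing is copied yet.
        self._pending.append((dst, list(src)))

    def synchronize(self):
        # Stand-in for waiting on the stream: run all queued copies now.
        for dst, src in self._pending:
            dst[:] = src
        self._pending.clear()


class FakeRing:
    """Models a ring whose reserved span becomes readable when released."""

    def __init__(self, size):
        self.data = [0] * size
        self.committed = None  # what a reader in another block would see

    @contextmanager
    def reserve(self):
        yield self.data
        # On release the ring publishes the span; it knows nothing about
        # any copies still queued on a stream.
        self.committed = list(self.data)


def write_span(ring, stream, payload, sync):
    with ring.reserve() as span:
        stream.copy_async(span, payload)
        if sync:
            # Finish the queued work before the span is released.
            stream.synchronize()
    return ring.committed


unsynced = write_span(FakeRing(4), FakeStream(), [1, 2, 3, 4], sync=False)
synced = write_span(FakeRing(4), FakeStream(), [1, 2, 3, 4], sync=True)
```

In this model `unsynced` is published before the copy has run, while `synced` contains the payload. In real Bifrost code the corresponding call would be the bifrost.device.synchronize_stream() mentioned in the question, placed inside the reserve context.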

benbarsdell avatar Apr 01 '20 21:04 benbarsdell

Ok, thanks.

jaycedowell avatar Apr 01 '20 21:04 jaycedowell