How do multiple GPUs communicate with each other
Hello, I would like to know how to explicitly observe the communication between multiple GPUs and how they exchange memory information. I noticed that the Distribution function can map physical memory to different GPUs. My current research focuses on GPU interconnect communication, so I would like to ask for your advice on this. Thank you!
I would not say distribution is about GPU-to-GPU communication; rather, it is about how memory is allocated across the GPUs.
May I ask what you want to understand about GPU-to-GPU communication? There are two components you can examine. One is the RDMA engine, which performs cache-line-level memory accesses across GPUs: https://github.com/sarchlab/mgpusim/tree/v3/timing/rdma. The other is the Endpoint, a network component that gathers all the outgoing and incoming traffic of a device: https://github.com/sarchlab/akita/blob/v3/noc/networking/switching/endpoint.go. See the sketch below for one way to observe this traffic.
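If the goal is to explicitly observe cross-GPU traffic, one common approach in Akita-style simulators is to attach an observer (hook) to a component such as the Endpoint or the RDMA engine and log every message it handles. The sketch below is self-contained and illustrative only: the names `Msg`, `Hook`, `Hookable`, `Endpoint.Send`, and `msgLogger` are assumptions for demonstration, not the actual mgpusim/akita API, which may differ across versions.

```go
// A minimal, self-contained sketch of the hook/observer pattern for
// watching inter-GPU messages. All type and method names here are
// hypothetical stand-ins, not the real mgpusim/akita interfaces.
package main

import "fmt"

// Msg stands in for a message traveling between GPUs.
type Msg struct {
	Src, Dst string // sending and receiving device
	Bytes    int    // payload size, e.g., one cache line
}

// Hook is invoked whenever a hookable component processes a message.
type Hook interface {
	Func(m Msg)
}

// Hookable is embedded by components that accept observers.
type Hookable struct {
	hooks []Hook
}

func (h *Hookable) AcceptHook(hook Hook) {
	h.hooks = append(h.hooks, hook)
}

func (h *Hookable) invokeHooks(m Msg) {
	for _, hook := range h.hooks {
		hook.Func(m)
	}
}

// Endpoint models a network endpoint that forwards a device's traffic.
type Endpoint struct {
	Hookable
	Name string
}

// Send notifies all attached hooks before forwarding, so observers see
// every piece of outgoing traffic without changing the endpoint logic.
func (e *Endpoint) Send(m Msg) {
	e.invokeHooks(m)
	// ... actual routing into the network would happen here ...
}

// msgLogger prints every message it observes.
type msgLogger struct{}

func (msgLogger) Func(m Msg) {
	fmt.Printf("msg: %s -> %s, %d bytes\n", m.Src, m.Dst, m.Bytes)
}

func main() {
	ep := &Endpoint{Name: "GPU1.Endpoint"}
	ep.AcceptHook(msgLogger{})

	// A cache-line-sized cross-GPU access, as the RDMA engine might issue.
	ep.Send(Msg{Src: "GPU1", Dst: "GPU2", Bytes: 64})
}
```

If the RDMA engine and Endpoint in your version expose a similar hooking mechanism (Akita components generally support hooks), attaching a logging hook like this lets you dump each cross-GPU memory access as it happens, without modifying the components themselves.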
Thank you for your reply! I have figured out where my problem is.