Hong-Rong Hsu

13 comments by Hong-Rong Hsu

> Note: our library currently supports only CUDA devices, hence the cu121 in the above link. CPU support is not stable due to Gloo's weak P2P support. I ran...

Hi @kwen2501, sorry for the late reply. Our use case is running LLM inference across multiple CPU-based clusters. Could you tell me what is missing in Gloo? How about the...

@kwen2501 Can I open a separate issue in the PyTorch GitHub repository to track the CPU Gloo hang?