Further explanation for `batch_isend_irecv`

Open · botbw opened this issue 10 months ago · 1 comment

📚 The doc issue

Here the docstring says the function returns `dist.Work` objects obtained by calling the corresponding ops. However, the returned reqs might not be "corresponding": for example, `batch_isend_irecv([op1, op2])` might return only a single `[coalescing_req]`, which is quite confusing. Alternatively, could we have a 1-to-1 mapping between `p2p_op_list` and the returned reqs?

Args:
    p2p_op_list: A list of point-to-point operations(type of each operator is
        ``torch.distributed.P2POp``). The order of the isend/irecv in the list
        matters and it needs to match with corresponding isend/irecv on the
        remote end.

Returns:
    A list of distributed request objects returned by calling the corresponding
    op in the op_list.
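
To make the mismatch concrete, here is a minimal sketch of calling code that stays correct whether the returned list has one entry per op or a single coalesced request. The helper names `wait_all` and `exchange_with_neighbors` are hypothetical and not part of `torch.distributed`. The sketch assumes a process group has already been initialized, and that every returned request exposes `wait()`.

```python
def wait_all(reqs):
    """Wait on every returned request; return how many requests there were.

    Hypothetical helper: it deliberately does not assume
    len(reqs) == len(p2p_op_list), since the backend may coalesce ops.
    """
    for req in reqs:
        req.wait()
    return len(reqs)


def exchange_with_neighbors(send_tensor, recv_tensor, send_to, recv_from):
    """Post one isend and one irecv as a batch and block until both finish.

    Hypothetical helper; assumes dist.init_process_group() was already called.
    """
    # Imported lazily so that wait_all can be used without torch installed.
    import torch.distributed as dist

    ops = [
        dist.P2POp(dist.isend, send_tensor, send_to),
        dist.P2POp(dist.irecv, recv_tensor, recv_from),
    ]
    reqs = dist.batch_isend_irecv(ops)
    # Do not index reqs by op position (e.g. reqs[1] for the irecv):
    # the list may contain a single coalesced request covering both ops.
    wait_all(reqs)
    return recv_tensor
```

Waiting on the whole list, rather than looking up a request by the position of its op, avoids depending on a 1-to-1 mapping that the current docstring implies but that may not hold.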

Suggest a potential alternative/fix

No response

cc @mrshenli @pritamdamania87 @zhaojuanmao @satgera @rohan-varma @gqchen @aazzolini @osalpekar @jiayisuse @H-Huang @kwen2501 @awgu @penguinwu @fegin @XilunWu @wanchaol @fduwjj @wz337 @tianyu-l @wconstab @yf225 @chauhang @d4l3k

botbw, Apr 27 '24 07:04

Thanks, we will improve the document.

kwen2501, May 01 '24 05:05