Further explanation for `batch_isend_irecv`
📚 The doc issue
The docstring quoted below says the function returns the `dist.Work` objects produced by calling the corresponding ops. However, the returned reqs might not be "corresponding": for example, `batch_isend_irecv([op1, op2])` might return only a single request, `[coalescing_req]`, which is quite confusing. Alternatively, could we have a 1-to-1 mapping between `p2p_op_list` and the returned reqs?
Args:
    p2p_op_list: A list of point-to-point operations(type of each operator is
        ``torch.distributed.P2POp``). The order of the isend/irecv in the list
        matters and it needs to match with corresponding isend/irecv on the
        remote end.

Returns:
    A list of distributed request objects returned by calling the corresponding
    op in the op_list.
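To make the concern concrete, here is a minimal sketch of caller code that stays correct whether the returned list has one entry per op or a single coalesced request. `wait_all` and `FakeReq` are hypothetical names introduced only for illustration; they are not part of `torch.distributed`. The only assumption about the returned objects is that each one exposes `wait()`.

```python
from typing import Iterable, Protocol


class Waitable(Protocol):
    """Anything exposing wait(), such as the requests returned by batch_isend_irecv."""

    def wait(self) -> object: ...


def wait_all(reqs: Iterable[Waitable]) -> int:
    """Wait on every returned request and report how many there were.

    Hypothetical helper: it deliberately does not assume that the number
    of requests equals the number of ops that were submitted.
    """
    count = 0
    for req in reqs:
        req.wait()
        count += 1
    return count


class FakeReq:
    """Stand-in request so the sketch runs without a process group."""

    def __init__(self) -> None:
        self.done = False

    def wait(self) -> bool:
        self.done = True
        return True
```

In real code this would look something like `reqs = dist.batch_isend_irecv([dist.P2POp(dist.isend, send_t, peer), dist.P2POp(dist.irecv, recv_t, peer)])` followed by `wait_all(reqs)`, where `send_t`, `recv_t`, and `peer` are placeholders. The pattern to avoid is `zip(p2p_op_list, reqs)`, since the issue above reports that the two lists can differ in length.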
Suggest a potential alternative/fix
No response
cc @mrshenli @pritamdamania87 @zhaojuanmao @satgera @rohan-varma @gqchen @aazzolini @osalpekar @jiayisuse @H-Huang @kwen2501 @awgu @penguinwu @fegin @XilunWu @wanchaol @fduwjj @wz337 @tianyu-l @wconstab @yf225 @chauhang @d4l3k
Thanks, we will improve the documentation.