
[Question] Why is the ncclInt8 datatype used in nccl_all_to_all_scatter_async?

Open cicirori opened this issue 1 year ago • 1 comments

I'm wondering why the ncclInt8 datatype is used in the C++ implementation of nccl_all_to_all_scatter_async. Is it for speed reasons, or simply to avoid supporting multiple datatypes through templates?

Thanks!

cicirori, Dec 18 '23 11:12

According to bandwidth profiling, there is no speed difference between sending N elements as ncclInt8 and sending N / 4 elements as ncclInt32, so you can choose either.

ghostplant, Jun 06 '24 11:06
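To make the equivalence in the answer concrete: for a pure data-movement collective (no reduction), the datatype only determines how many bytes each element contributes, so a buffer can be described either as bytes or as wider words as long as the total byte count matches. The following is a minimal sketch of that arithmetic, not Tutel's actual code. The names TransferDesc, as_int8, as_int32, can_use_int32, and bytes_moved are hypothetical helpers introduced only for illustration, and no NCCL calls are made.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical descriptor: the (count, element size) pair a collective would
// be given. Not part of NCCL or Tutel.
struct TransferDesc {
    std::size_t count;      // number of elements passed to the collective
    std::size_t elem_size;  // size in bytes of the chosen datatype
};

// Describe a buffer of `num_elements` items of `element_size` bytes each as a
// stream of 1-byte elements (the ncclInt8-style view). Works for any dtype.
constexpr TransferDesc as_int8(std::size_t num_elements, std::size_t element_size) {
    return {num_elements * element_size, sizeof(std::int8_t)};
}

// The 4-byte view (ncclInt32-style) is only valid when the byte count is a
// multiple of 4.
constexpr bool can_use_int32(std::size_t byte_count) {
    return byte_count % sizeof(std::int32_t) == 0;
}

// Describe the same bytes as 4-byte elements. Assumes can_use_int32(byte_count).
constexpr TransferDesc as_int32(std::size_t byte_count) {
    return {byte_count / sizeof(std::int32_t), sizeof(std::int32_t)};
}

// Total bytes a descriptor covers; both views must agree on this value.
constexpr std::size_t bytes_moved(const TransferDesc& d) {
    return d.count * d.elem_size;
}

// Example: 1024 half-precision values (2 bytes each) cover the same 2048
// bytes whether described as 2048 one-byte elements or 512 four-byte elements.
static_assert(bytes_moved(as_int8(1024, 2)) == bytes_moved(as_int32(2048)),
              "both views cover the same bytes");
```

One practical difference this sketch highlights: the byte view needs no divisibility check, so a single code path can handle every tensor dtype without templates, whereas the 4-byte view requires the total size to be a multiple of 4.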