Linfeng Zheng

3 comments by Linfeng Zheng

You haven't set the size of the CooperativeGroup, so it defaults to 1. In your program, the producer_group has a size of 32 (`if tidx < 32`), while the consumer_group has size...

Thanks for pointing this out. Yes, `_nvvm_ops_gen.py` is sometimes missing ops we would like to use. Indeed, the {any, all} modes are not exposed in the NVVM IR in the current version. Before...

Hi @lucifer1004, we found that `torch.einsum` has a precision issue on the Ada architecture. If you call `torch.einsum` with CPU tensors, or use the fp32 datatype, the program could pass...