Yonghao Zhuang
Results: 41 comments of Yonghao Zhuang
It looks like you are using the `process_group` for context parallelism, while it was designed for tensor and sequence parallelism. As long as TP and CP are compatible, I...