xutizhou issues


Results: 3 issues by xutizhou

Ada Lovelace support

Hi team, thank you for your excellent work. I wonder if this repo could support the Ada Lovelace architecture, such as the L20 GPU. Thanks.

Low dispatch bandwidth on 8 H20 GPUs

Dispatch bandwidth is around 20 GB/s, while combine bandwidth is near the 50 GB/s peak.

Timeout always happens at num_tokens=128

I have tested DeepEP in normal mode on node2/node4/node4 and always encounter a DeepEP "timeout check failed" error when num_tokens=128. Here is my test code.

```python
def test_loop(local_rank: int, num_local_ranks: int, args: argparse.Namespace):
    num_nodes...
```