duanjunwen
1. Add DistributedAdafactor in "./colossalai/nn/optimizer/distributed_adafactor.py"; supported parameter input formats: RowParallel + Zero2 and ColParallel + Zero2. 2. Add test cases in "./tests/test_optimizer/test_distributred_adafactor_optim.py".
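As a rough illustration of the two parameter layouts named above, the sketch below shards a single weight matrix the way Megatron-style ColParallel/RowParallel layers usually do (dim 0 vs. dim 1 of a `(out_features, in_features)` weight); that convention, and the variable names, are assumptions rather than part of the commit.

```python
# Minimal sketch (not the committed code) of the two tensor-parallel layouts
# the optimizer must handle; dim assignment follows the usual Megatron-style
# convention and is an assumption here.
import torch

tp_size, rank = 2, 0            # hypothetical tensor-parallel size and rank
W = torch.randn(8, 16)          # full (unsharded) weight, (out, in)

col_shard = torch.chunk(W, tp_size, dim=0)[rank]   # ColParallel shard: (4, 16)
row_shard = torch.chunk(W, tp_size, dim=1)[rank]   # RowParallel shard: (8, 8)

# Under ZeRO-2 the optimizer states for these shards are further partitioned
# across the data-parallel group, so the distributed optimizer has to rebuild
# Adafactor's per-row / per-column second-moment statistics from local shards.
print(col_shard.shape, row_shard.shape)
```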
1. Update MoeHybridParallelPlugin; 2. Use MoeHybridParallelPlugin to replace MoEManager; 3. Remove test_moe_checkpoint.py's dependency on MoEManager.
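A hedged sketch of the plugin-based path that replaces the global MoEManager: the model factory is hypothetical, and the plugin/launch/checkpoint arguments (`tp_size`, `pp_size`, `ep_size`, `zero_stage`, `shard`) are assumptions to check against the actual signatures.

```python
# Sketch only: run under torchrun with a distributed environment; older
# colossalai versions may require launch_from_torch(config={}).
import torch
import colossalai
from colossalai.booster import Booster
from colossalai.booster.plugin import MoeHybridParallelPlugin

colossalai.launch_from_torch()

model = torch.nn.Sequential(torch.nn.Linear(8, 8))   # placeholder; a real MoE model is assumed
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

# Parallelism is configured on the plugin instead of a global MoEManager.
plugin = MoeHybridParallelPlugin(tp_size=1, pp_size=1, ep_size=2, zero_stage=1)
booster = Booster(plugin=plugin)
model, optimizer, *_ = booster.boost(model, optimizer)

# Checkpointing also goes through the booster, which is why
# test_moe_checkpoint.py no longer needs MoEManager state (flags are assumptions).
booster.save_model(model, "moe_ckpt", shard=True)
```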
Hi @apachemycat, would you mind sharing the version of flash-attn in your environment? I am using flash-attn==2.5.7 and everything looks good. Also, you can replace dropout_layer_norm with torch.nn.functional.layer_norm &...
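For reference, a rough plain-PyTorch stand-in for flash-attn's fused dropout + residual-add + layer-norm; it does not reproduce all of dropout_layer_norm's options (prenorm, rowscale, etc.), and the function name here is just illustrative.

```python
# Approximate replacement built only from torch.nn.functional ops.
import torch
import torch.nn.functional as F

def dropout_add_layer_norm(x, residual, weight, bias, p, eps, training=True):
    # dropout on x, add the residual, then layer-norm over the last dimension
    out = F.dropout(x, p=p, training=training) + residual
    return F.layer_norm(out, (out.shape[-1],), weight, bias, eps)

x = torch.randn(2, 16, 64)
res = torch.randn(2, 16, 64)
w, b = torch.ones(64), torch.zeros(64)
y = dropout_add_layer_norm(x, res, w, b, p=0.1, eps=1e-5)
print(y.shape)  # torch.Size([2, 16, 64])
```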