Reza Yazdani
Results: 21 issues of Reza Yazdani
Adds a new version of SoftMax for the Transformer kernel to support the triangular mask used in GPT-based models. This addresses https://github.com/microsoft/DeepSpeed/issues/828. TODO: Add a unit test for guarding against this type...
This addresses https://github.com/microsoft/DeepSpeed/issues/1279
This PR addresses https://github.com/microsoft/DeepSpeed/issues/581
This PR adds two sets of modifications to the unit tests for better test coverage of optimizer functionality: 1. Increase the parameter groups to better catch errors at the optimizer step...