
Microsoft Automatic Mixed Precision Library

31 MS-AMP issues

@tocean @wkcn In line with the investigation in https://github.com/NVIDIA/TransformerEngine/issues/424, it would be great to get insights from the team at Microsoft on using FP8 in aspects of training besides...

**What would you like to be added**: Tune the scaling factor automatically for FP8 collective communication. **Why is this needed**: Reducing the scaling factor to the minimum value across all GPUs may...
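The overflow concern behind this request can be sketched numerically: before an FP8 all-reduce, every rank must quantize with a common scaling factor, and the minimum scale across ranks (equivalently, the scale derived from the global amax) is the safe choice. A minimal illustration, assuming E4M3 FP8 with a maximum representable magnitude of 448; the per-GPU amax values and the `scale_for` helper are hypothetical:

```python
E4M3_MAX = 448.0  # largest representable magnitude in FP8 E4M3

def scale_for(amax):
    # Per-tensor scaling factor: map the observed amax onto the FP8 range.
    return E4M3_MAX / amax

# Hypothetical per-GPU amax values for the same gradient tensor.
amaxes = [1.5, 12.0, 3.2, 0.7]
scales = [scale_for(a) for a in amaxes]

# Before an FP8 all-reduce, every rank must quantize with the SAME scale.
# The minimum scale (i.e. the one derived from the global amax) is the
# only choice that guarantees no rank overflows:
safe_scale = min(scales)

for a in amaxes:
    quantized = a * safe_scale   # quantized magnitude on each rank
    assert quantized <= E4M3_MAX  # no overflow anywhere
```

In a real setup this `min` would be computed with a collective (e.g. a MIN all-reduce over the local scales) before the FP8 communication itself, which is the automation this issue asks for.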

**What would you like to be added**: Move extension installation from the post-install step to setup.py under the project root folder. **Why is this needed**: Extensions are part of MS-AMP and should...

**What's the issue, what's expected?**: Error when using MS-AMP to do LLM SFT. MS-AMP DeepSpeed config: "msamp": { "enabled": true, "opt_level": "O1|O2|O3", # all tried "use_te": false } **How to...
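For context, the `"O1|O2|O3"` value in the snippet above is the reporter's shorthand for having tried each level in turn; a valid config sets a single optimization level. A minimal sketch of the relevant DeepSpeed config fragment, assuming the `msamp` section layout shown in the issue (the `train_batch_size` value is a placeholder):

```json
{
  "train_batch_size": 32,
  "msamp": {
    "enabled": true,
    "opt_level": "O2",
    "use_te": false
  }
}
```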

**What would you like to be added**: Integrate MS-AMP with PyTorch Lightning **Why is this needed**: MS-AMP shows huge gains in throughput when training in FP8. That's very exciting. Adoption...

Hello, I hope this message finds you well. I am a user of MS-AMP and have found it to be incredibly useful in my work with large-scale language...

**Description** Avoid running workflows on self-hosted nodes: 1. switch image builds to GitHub runners; 2. remove the UT workflow.

**What's the issue, what's expected?**: The compilation gets stuck at `./MS-AMP/third_party/msccl/build/obj/collectives/device/msccl_kernel.o` **How to reproduce it?**: `do the steps from the doc` **Log message or snapshot?**: ``` Compiling msccl_kernel.cu > .../MS-AMP/third_party/msccl/build/obj/collectives/device/msccl_kernel.o ```...

**Description** This PR makes it easier for users to use FSDP with MS-AMP from their existing optimizers. This is especially beneficial for library authors, as currently we need to go...

**What's the issue, what's expected?**: There are attributes inside regular `deepspeed.runtime` that are missing in this repo and that the monkey-patch doesn't cover, such as: ```python from deepspeed.runtime.lr_schedules import VALID_LR_SCHEDULES...