DeepSpeed
FastGen H100 MoE support: Add PyTorch multi-GEMM MoE implementation