ghostplant

Results 272 comments of ghostplant

Hi, the standard GShard MoE follows the branch `self.is_gshard_loss == True`, while the loss option you pointed out is designed and preferred by Swin-Transformer MoE. According to `load_importance_loss` defined in...

> I recall something about `SM80_16x8x16_F16F16F16F16_TN` being significantly more difficult to optimize than `SM80_16x8x16_F32F16F16F32_TN`, though I forget the details as it's been a while since deep work on Ampere. >...

> I found [this](https://github.com/reed-lau/cute-gemm/tree/main) high-performance implementation of GEMM using CuTe. The author wrote a series of tutorials on CuTe (in Chinese) on [Zhihu](https://www.zhihu.com/column/c_1696937812497235968). And in one of the tutorials,...

Is this from the latest Tutel version? It doesn't seem to match the error below: https://github.com/microsoft/Tutel/blob/main/tutel/custom/custom_kernel.cpp#L33 Another question: are you using the CUDA backend or the ROCm backend?

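Where did the new version of WPS come from? Isn't it the bundled WPS? Install the libraries that the error message reports as missing through apt, for example `apt install libqt5gui5`.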

The original version does install. Could you paste the installation error message from the new version?

@Gelip Does Win2003 work? I think GPT partition support starts with that version.

@aroenai I don't think so; x64 also works even without CSMWarp. The very early solution had already proved that GPT booting works for 2K3 without anything special needed....