Haiyang Huang comments

Results: 5 comments of Haiyang Huang

[BUG] Running DeepSpeed with MoE inference leads to CUDA illegal memory access and NaN activation

The problem seems to be rooted in the ds_qkv_gemm implementation under FP16. This kernel works fine when handling FP32 inputs. However, when running under FP16, only the inp_norm can be...

[BUG] Running DeepSpeed with MoE inference leads to CUDA illegal memory access and NaN activation

Here is a screenshot created by the same script with different precisions. On the left are the results of a dense layer given FP32, and on the right are the results...

[BUG] Running DeepSpeed with MoE inference leads to CUDA illegal memory access and NaN activation

Sure, here is the script I'm using. I made some modifications to deepspeed/module_inject/replace_module.py to ensure the args and flags are respected by the deepspeed.init_inference() function. Besides the fp16 and kernel...

Solving environment: failed with initial frozen solve. Retrying with flexible solve.

Same problem here on Ubuntu.

Compiled PSMP too slow and fails the tests

Thank you for your reply! I passed the make test by changing some of the configuration I was using, but I am not sure if I really solved all the problems I...