Ma, Guokai
I see the NPU FusedAdam is implemented with `torch_npu.npu_apply_adam_w`. When implementing new features in the future, does NPU intend to support them through torch_npu, or might kernels also be implemented in DeepSpeed as...
Hi @loadams, currently we are seeking higher test coverage in the area of AutoTP. @Yejing-Lai is investigating whether more model coverage is possible. On the other hand, some...
I tried running `pytest -m 'seq_inference' unit/` locally and saw 3 passed tests with this branch. Yet the same branch gets these three tests skipped on CI...
The slow test run is caused by this line. It loads the model list from huggingface_hub, and doing that for every test is slow. A proper fix should make it persistent...
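For illustration, a minimal sketch of one way to make the lookup persistent within a test session (not necessarily the fix applied in the actual commit); the `author` and `limit` filters are placeholders:

```python
import functools

from huggingface_hub import HfApi


@functools.lru_cache(maxsize=1)
def get_model_list():
    # Query huggingface_hub once per process; later calls reuse the cached
    # result instead of hitting the Hub again for every test.
    api = HfApi()
    return list(api.list_models(author="bigscience", limit=50))
```

If the list needs to survive separate pytest invocations, the same function could serialize the result to a local file and read it back on later runs.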
@loadams the slow test has been fixed with the latest commit. Now the tests run in around 13 minutes on my local environment. The strange thing is that on my local environment, there...
The skip message is truncated because of the 80-column limit when pytest is running. We can expand the `COLUMNS` env variable to 140 to see the full skip message. Another issue is...
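As a sketch, the same widening can be done from Python rather than exporting the variable in the shell; the marker and path are taken from the earlier `pytest -m 'seq_inference' unit/` command, and `-rs` simply asks pytest to print skip reasons in the summary:

```python
import os
import subprocess

# pytest derives its terminal width from shutil.get_terminal_size(), which
# honors the COLUMNS environment variable, so 140 avoids the 80-column cutoff.
env = dict(os.environ, COLUMNS="140")
subprocess.run(["pytest", "-rs", "-m", "seq_inference", "unit/"], env=env, check=False)
```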
Intel Extension for PyTorch 2.1 has been released. Will update this PR to change the workflow to PyTorch 2.1 accordingly.
Hi @loadams @tjruwase, we are adding the tensor parallel UT back into the CPU inference workflow. One issue we met is that the GitHub "ubuntu-20.04" instance has only one CPU with two cores. We...
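For context, a minimal sketch of how a tensor-parallel test could be guarded on such under-provisioned runners (an illustration, not the actual workflow change; `TP_SIZE` and the test name are hypothetical):

```python
import os

import pytest

TP_SIZE = 2  # hypothetical tensor-parallel degree used by the unit test

# os.cpu_count() reports logical cores; the two-core runner would still pass
# this check, so a stricter physical-core check may be needed in practice.
requires_multicore = pytest.mark.skipif(
    (os.cpu_count() or 1) < TP_SIZE,
    reason=f"tensor parallelism needs at least {TP_SIZE} cores",
)


@requires_multicore
def test_autotp_cpu_inference():
    ...
```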
Note: currently there is a CPU test failure due to a compatibility issue between https://github.com/microsoft/DeepSpeed/commit/4fc181b01077521ba42379013ce91a1c294e5d8e and PyTorch 2.1. We are working on a fix and will get back to you when a PR...
> Note: currently there is a CPU...

This failure should be fixed by https://github.com/microsoft/DeepSpeed/pull/4578. We should merge #4578 into this PR so they can be tested together on a CPU host.