Add Ascend NPU as a backend
Description & Motivation
Ascend is a full-stack AI computing infrastructure for industry applications and services based on Huawei Ascend processors and software. For more information about Ascend, see Ascend Community.
CANN (Compute Architecture for Neural Networks), developed by Huawei, is a heterogeneous computing architecture for AI.
PyTorch has officially announced support for the Ascend NPU (through the PrivateUse1 dispatch key); please see the PrivateUse1 tutorial here.
This issue proposes new backend support for pytorch-lightning, allowing users with Ascend NPUs to also use the convenient development and acceleration capabilities provided by pytorch-lightning.
Pitch
I'd like to add a new accelerator and register it in Accelerators, making it possible for users to select this backend with accelerator='npu'.
Alternatives
No response
Additional context
I have written a demo; please refer to: https://github.com/Lightning-AI/pytorch-lightning/pull/19308
cc @borda
@lantiga Good day, could you please give me some suggestions? Thanks.
@hipudding have you figured out how to solve this issue?
Not yet. I have created a draft PR, which is a demo for this new backend. I need a reply from the community to guide my subsequent development tasks. However, I haven't gotten any response yet, so this issue is currently pending.
@hipudding
Hello,
I have recently utilized the draft code here for training with PyTorch Lightning on NPU. It has proven to be quite useful, and I would like to extend my gratitude for your contribution. I am hopeful that the pending pull request (PR) can be reviewed and merged into the official repository promptly.
Thank you once again for your valuable work.
Best regards
I think it could be made an extension repo.
@Borda Thanks for your reply. Could you please tell me how to go about this extension repo? Is there any guidance on extension development, or should I just use monkey patching?
If I want to use Lightning on Ascend NPU now, could you give me some draft templates or suggestions? Thanks!
Yes, here's the demo.
Hi @hipudding, any update? I have used your draft code for training. I found that although it runs pretty well with minimal changes, the speed is quite slow (910B vs. V100). I wonder whether the problem stems from torch_npu or the Lightning framework. Thank you in advance.
Thanks for using this PR. Actually, I didn't do any performance analysis. If you are not using any strategy, it is only a thin wrapper around torch (torch_npu). You could run a simple demo that uses only torch (torch_npu) to see where the root cause of this performance issue lies.
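As a concrete way to run that comparison, a plain-torch timing loop (no Lightning at all) on the same kind of model step can isolate framework overhead from torch_npu overhead. This is only a sketch: the "npu" branch assumes torch_npu is installed, and it falls back to CPU otherwise so the script still runs.

```python
# Minimal plain-torch timing sketch to isolate framework overhead.
# The "npu" device assumes the torch_npu plugin is installed; otherwise
# we fall back to CPU so the sketch remains runnable.
import time

import torch

device = torch.device(
    "npu" if hasattr(torch, "npu") and torch.npu.is_available() else "cpu"
)

# A small Transformer-style step, since the slowdown was seen on
# Transformer-based models.
model = torch.nn.TransformerEncoderLayer(
    d_model=128, nhead=4, batch_first=True
).to(device)
opt = torch.optim.SGD(model.parameters(), lr=0.01)
x = torch.randn(8, 32, 128, device=device)  # (batch, seq, d_model)

start = time.perf_counter()
for _ in range(10):
    opt.zero_grad()
    out = model(x)
    out.sum().backward()
    opt.step()
elapsed = time.perf_counter() - start
print(f"10 raw-torch steps took {elapsed:.3f}s on {device}")
```

If this plain loop is already slow, the bottleneck is in torch_npu (or the hardware/driver stack) rather than in Lightning; if it is fast, the overhead is on the Lightning side.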
Thanks for your quick reply! I found something strange: when I train the autoencoder, the speed is normal and 2x better than a V100. However, when I train a Transformer-based model, the speed is extremely slow (10x slower).
Sorry, I'm not an expert in this area and I can't give you any advice.
We are trying to evaluate the possibility of running PL on the Huawei NPU. Thanks for your contribution. I agree with @Borda that making it an extension repo would be better.
Two examples would be:
https://github.com/Lightning-AI/lightning-Graphcore
https://github.com/Lightning-AI/lightning-Habana
We are currently in the process of procuring the hardware. Looking forward to your opinion.
Thanks. We will try to make it an extension repo.