What optimizer can be used to replace FusedEmaAdam on an NPU device?
System Info / 系統信息
I am trying to run SFT on an NPU device. However, FusedEmaAdam relies on CUDA, which is not available on the NPU. What optimizer can be used to replace FusedEmaAdam on the NPU device?
Information / 问题信息
- [X] The official example scripts / 官方的示例脚本
- [ ] My own modified scripts / 我自己修改的脚本和任务
Reproduction / 复现过程
AttributeError: 'FusedEmaAdamBuilder' object has no attribute 'multi_tensor_ema_adam'
Expected behavior / 期待表现
Run sft successfully on NPU device
I am not very familiar with accelerating Adam on NPU and have not tried it, so we may not be able to provide an effective equivalent. @tengjiayan20, can you offer some ideas?
You can use the AdamW optimizer instead of FusedEmaAdam.
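For reference, a minimal sketch of the swap, assuming your finetune script exposes the model and the usual Adam hyperparameters (the values below are placeholders): `torch.optim.AdamW` is device-agnostic, so it runs on NPU as well.

```python
import torch
import torch.nn as nn

# Placeholder model; substitute the model built in your finetune script.
model = nn.Linear(16, 16)

# Illustrative hyperparameters; copy the lr/betas/weight_decay your
# script currently passes to FusedEmaAdam.
optimizer = torch.optim.AdamW(
    model.parameters(),
    lr=1e-5,
    betas=(0.9, 0.95),
    weight_decay=0.1,
)
```

Note that, unlike FusedEmaAdam, plain AdamW does not maintain an exponential moving average of the weights; if your setup relies on EMA weights, you would need to track them separately.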
@tengjiayan20 Thanks, I tried it and it works. But I have another question. I now have four cards; how can I run SFT on multiple cards in a single machine? How should I modify the environment variables in the finetune script, such as WORLD_SIZE=1 RANK=0 LOCAL_RANK=0 LOCAL_WORLD_SIZE=1? @tengjiayan20 @zRzRzRzRzRzRzR
Should I run 4 separate finetune processes with a different LOCAL_RANK on each of the 4 devices?
We will upload a script soon.
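Until then, a minimal sketch of a single-machine, 4-card launch, assuming the torch_npu adapter is installed (the script name and arguments are placeholders): you do not start 4 processes by hand or hard-code the ranks; a launcher such as torchrun spawns one worker per card and exports WORLD_SIZE, RANK, and LOCAL_RANK for each.

```python
# One launch command replaces setting the env vars manually; torchrun
# spawns 4 workers and exports WORLD_SIZE/RANK/LOCAL_RANK per process:
#
#   torchrun --standalone --nproc_per_node=4 finetune.py <your args>
#
# Inside the script, each worker reads LOCAL_RANK and binds one card:
import os

import torch
import torch_npu  # Ascend adapter; enables the torch.npu namespace
import torch.distributed as dist

local_rank = int(os.environ["LOCAL_RANK"])  # 0..3, set by torchrun
torch.npu.set_device(local_rank)            # one NPU card per worker
dist.init_process_group(backend="hccl")     # HCCL is the collective backend for NPU
```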