oneflow
Add CUDA AMP scaler for eager mode
Eager AMP now supports a scaler, so eager mode can run complete AMP training.
https://github.com/Oneflow-Inc/OneTeam/issues/1754
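For readers unfamiliar with what an AMP scaler does, the following is a minimal, framework-free sketch of dynamic loss scaling. It is not the implementation in this PR: the class name `DynamicLossScaler`, its parameters, and the default values are all hypothetical and chosen for illustration only. The idea is to multiply the loss by a scale factor so small fp16 gradients do not underflow, skip the optimizer step and shrink the scale when a gradient overflows, and grow the scale again after a run of clean steps.

```python
import math


class DynamicLossScaler:
    """Illustrative dynamic loss scaler (hypothetical, not OneFlow's code)."""

    def __init__(self, init_scale=2.0 ** 16, growth_factor=2.0,
                 backoff_factor=0.5, growth_interval=2000):
        self.scale = init_scale
        self.growth_factor = growth_factor
        self.backoff_factor = backoff_factor
        self.growth_interval = growth_interval
        self._good_steps = 0  # consecutive steps without overflow

    def scale_loss(self, loss):
        # Scale the loss before backward so gradients are scaled too.
        return loss * self.scale

    def step(self, scaled_grads):
        """Return unscaled gradients, or None if the step must be skipped."""
        if any(not math.isfinite(g) for g in scaled_grads):
            # Overflow detected: shrink the scale and skip this update.
            self.scale *= self.backoff_factor
            self._good_steps = 0
            return None
        # Unscale with the scale that was used for this backward pass.
        grads = [g / self.scale for g in scaled_grads]
        self._good_steps += 1
        if self._good_steps >= self.growth_interval:
            # Enough clean steps in a row: try a larger scale.
            self.scale *= self.growth_factor
            self._good_steps = 0
        return grads
```

In a real training loop the gradients would be tensors and the finiteness check would run on the device; plain floats are used here only to keep the sketch self-contained.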
Eager AMP test results after setting `export ONEFLOW_VM_COMPUTE_ON_WORKER_THREAD=0`:
Both training speed and GPU memory usage improve considerably compared with fp32 mode.
Speed stats: