DeepSpeed
[REQUEST] BF16 mixed precision => grad accum in fp32
Is your feature request related to a problem? Please describe.
We have shown with the BLOOM training, using Megatron-DeepSpeed, that BF16 is far superior to FP16 for mixed precision training.
But Megatron-DeepSpeed is very complex; it'd be much easier for folks to use standalone ZeRO for training in bf16 mixed precision.
For that to work, ZeRO needs to support gradient accumulation in fp32, similar to the recently added BF16Optimizer.
So this is a feature request to backport BF16Optimizer's fp32 gradient accumulation to ZeRO-1, 2 and 3.
Once this is done I can safely point those who are interested in an easier solution to ZeRO.
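To make the motivation concrete, here is a small stdlib-only Python sketch (the `to_bf16` helper is a hypothetical emulation for illustration, not DeepSpeed code) showing why accumulating many small gradients directly in bf16 loses mass, while an fp32-style accumulator does not:

```python
import struct

def to_bf16(x: float) -> float:
    """Emulate bfloat16 by keeping only the top 16 bits of the float32
    representation (truncation; real hardware rounds to nearest, but the
    drift is the same in kind)."""
    (bits,) = struct.unpack("<I", struct.pack("<f", x))
    (y,) = struct.unpack("<f", struct.pack("<I", bits & 0xFFFF0000))
    return y

grad = 1e-3          # a typical small per-step gradient contribution
steps = 1000

acc_bf16 = 0.0       # accumulator re-truncated to (emulated) bf16 each step
acc_fp32 = 0.0       # accumulator kept in full precision
for _ in range(steps):
    acc_bf16 = to_bf16(acc_bf16 + grad)
    acc_fp32 += grad

# bf16 has only 8 significant bits: once the accumulator reaches ~0.25,
# adding 1e-3 no longer changes it, so the bf16 sum stalls far below the
# true total of ~1.0.
print(f"bf16 accumulator: {acc_bf16:.4f}")  # stalls well below 1.0
print(f"fp32 accumulator: {acc_fp32:.4f}")  # ~1.0
```

This is exactly the failure mode that keeping the gradient accumulation buffers in fp32 avoids.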
@tjruwase, @jeffra
@tjruwase, would it be possible to implement this? We are ready to start using ZeRO-3/bf16 for the multi-modal training.
Thank you very much!
@stas00, is it better to close this or #2768? They are the same thing, right?
Hi Tunji - you're the owner so it's up to you to decide. The new one is a duplicate of this one, so typically the earliest one stays.
And I don't agree with the other request that it should be hardcoded to fp32; it should be a user choice. Though most likely fp32 is a sensible default for bf16 mixed precision training.
Makes sense, will close the newer one and reference this appropriately.
Yes, the accumulation type will be configurable. Hopefully, we should have a WIP pushed later this week. It would be great to get your usual feedback as we iterate on a solution.
Fantastic news, Tunji. Thank you.
And, yes, we would be happy to experiment with your WIP PR.
Amazing, thanks!
Hi, I've run into the same problem. @tjruwase, have you found a solution?
Great, looking forward to seeing this new release!
Any update on this @tjruwase?
Please see: https://github.com/microsoft/DeepSpeed/pull/2847
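For anyone landing here later: assuming the option shipped roughly as proposed in that PR, the knob is a `grad_accum_dtype` entry in the DeepSpeed config. The exact key layout below is my reading of the PR, so double-check against the current DeepSpeed configuration docs:

```json
{
  "bf16": { "enabled": true },
  "data_types": { "grad_accum_dtype": "fp32" },
  "zero_optimization": { "stage": 3 }
}
```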
Closing as completed by #2847.