DeepSpeed and bf16-true

Open: rasbt opened this issue 2 years ago • 1 comment

In the finetuning scripts, we only allow

precision: Literal["bf16-true", "32-true"] = "bf16-true",

But we also use DeepSpeed when devices > 1. However, in this case, you'd get a

ValueError: `precision='bf16-true')` is not supported in DeepSpeed. `precision` must be one of: ('32-true', '16-mixed', 'bf16-mixed').

Should we allow bf16-mixed, or should we switch to FSDP? Or something else?

rasbt, Jun 06 '23 22:06

We can choose the precision based on whether DeepSpeed is used. I guess @awaelchli manually changed the precision value when trying out DeepSpeed in https://github.com/Lightning-AI/lit-llama/pull/128 (where this code originally comes from).
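
A rough sketch of what choosing the precision based on DeepSpeed usage could look like. This is not the code from the finetuning scripts: `resolve_precision` is a hypothetical helper, the "DeepSpeed is used when devices > 1" rule is taken from the issue description, the supported values are copied from the error message above, and falling back from bf16-true to bf16-mixed is only an assumption (it is one of the options raised in the issue, not a decision).

```python
# Precision values DeepSpeed accepts, copied from the ValueError quoted in the issue.
DEEPSPEED_PRECISIONS = ("32-true", "16-mixed", "bf16-mixed")


def resolve_precision(requested: str, devices: int) -> str:
    """Hypothetical helper: return a precision that works for the chosen setup.

    Assumes, as described in the issue, that DeepSpeed is used whenever
    more than one device is requested.
    """
    uses_deepspeed = devices > 1
    if not uses_deepspeed or requested in DEEPSPEED_PRECISIONS:
        # Single device, or the value is already supported by DeepSpeed.
        return requested
    if requested == "bf16-true":
        # Assumed fallback: swap in the closest value DeepSpeed supports.
        return "bf16-mixed"
    raise ValueError(
        f"precision={requested!r} is not supported with DeepSpeed; "
        f"expected one of {DEEPSPEED_PRECISIONS}"
    )
```

The resolved value would then be passed on as the `precision` argument instead of the raw default, so single-device runs keep bf16-true while multi-device runs avoid the error.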

Note that FSDP also wouldn't work with bf16-true.

carmocca, Jun 06 '23 23:06