EasyLM
Llama 7b Pretraining Dtype
Hi, thank you so much for releasing this wonderful code!
I noticed that in your examples/pretrain_llama_7b.sh, the dtype is set to fp32, which seems to make the activations fp32. However, I think it's more common to keep activations in bf16? Also, it seems like the param_dtype is always set to fp32.
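To make the question concrete, here is a minimal sketch of the distinction as I understand it, written in plain JAX rather than taken from the EasyLM code. The `dense` helper and the toy shapes are hypothetical and only for illustration: the parameters stay in fp32 (the role I assume param_dtype plays), while the computation and the resulting activations use whatever dtype is passed in.

```python
import jax
import jax.numpy as jnp


def dense(params, x, dtype):
    """Hypothetical dense layer: storage dtype and compute dtype are separate."""
    # The kernel is stored in its own dtype (fp32 here) and is cast to
    # `dtype` only for the matmul, so the stored copy is left unchanged.
    kernel = params["kernel"].astype(dtype)
    return jnp.dot(x.astype(dtype), kernel)


key = jax.random.PRNGKey(0)
# Parameters kept in fp32, analogous to param_dtype=fp32.
params = {"kernel": jax.random.normal(key, (8, 4), dtype=jnp.float32)}
x = jnp.ones((2, 8), dtype=jnp.float32)

out_fp32 = dense(params, x, jnp.float32)   # activations in fp32
out_bf16 = dense(params, x, jnp.bfloat16)  # activations in bf16

print(out_fp32.dtype, out_bf16.dtype, params["kernel"].dtype)
```

In this sketch, switching the compute dtype to bf16 changes only the activation dtype; the stored parameters remain fp32 in both cases.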
Could you please elaborate a bit on this choice? Thank you very much!