bigscience

Why is deepspeed enabled in the Bloom training script?

Open robertLiuLinFeng opened this issue 8 months ago • 0 comments

Why is the ZeRO stage set to 0 when DeepSpeed is enabled in the Bloom training script? Can the Bloom model still be trained, with a matching loss curve, when DeepSpeed is disabled? Thanks very much.

DEEPSPEED_ARGS=" \
    --deepspeed \
    --deepspeed_config ${config_json} \
    --zero-stage ${ZERO_STAGE} \
    --deepspeed-activation-checkpointing \
    "

robertLiuLinFeng, Nov 01 '23 12:11