polestar

1 issue by polestar

If I use accelerate + DeepSpeed to train a model, and I set

deepspeed_config:
  gradient_accumulation_steps: 8
  offload_optimizer_device: cpu
  offload_param_device: cpu
  zero3_init_flag: false
  zero_stage: 2

does the order of backward(),...