polestar
Results
1
issues of
polestar
if I use accelerate+deepspeed to train a model, and I set deepspeed_config: gradient_accumulation_steps: 8 offload_optimizer_device: cpu offload_param_device: cpu zero3_init_flag: false zero_stage: 2 does the order of the order of backward(),...