Aoqun Jin

Results: 3 comments by Aoqun Jin

@Forainest When using DeepSpeed, you can install apex, which DeepSpeed will then use automatically. That works for me.

```bash
git clone https://github.com/NVIDIA/apex.git
cd apex
git checkout 741bdf50825a97664db08574981962d66436d16a
pip install...
```

@clement-swk When using DeepSpeed, you can install apex, which DeepSpeed will then use automatically. That works for me.

```bash
git clone https://github.com/NVIDIA/apex.git
cd apex
git checkout 741bdf50825a97664db08574981962d66436d16a
pip install...
```

@clement-swk You can also try removing the `accelerator.is_main_process` check. With the check in place, `save_model` is called only in the main process, which cannot get the states of the other devices. In...
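To illustrate the point about the main-process guard, here is a toy simulation in plain Python. It does not use Accelerate or DeepSpeed: `ToyShardedSaver`, `run`, `WORLD_SIZE`, and the shard strings are all hypothetical names made up for this sketch, and threads stand in for devices. It only assumes that the save is a collective step in which every rank has to contribute its own shard.

```python
import threading

WORLD_SIZE = 4  # hypothetical number of devices, simulated with threads


class ToyShardedSaver:
    """Toy stand-in for a sharded save: every rank must contribute its shard."""

    def __init__(self, world_size):
        self.world_size = world_size
        self.barrier = threading.Barrier(world_size)
        self.lock = threading.Lock()
        self.shards = {}

    def save_model(self, rank, shard, timeout=1.0):
        # Each rank registers its own shard, then waits for all other ranks.
        with self.lock:
            self.shards[rank] = shard
        try:
            self.barrier.wait(timeout=timeout)
        except threading.BrokenBarrierError:
            # Not every rank showed up, so the full state cannot be assembled.
            return None
        return [self.shards[r] for r in range(self.world_size)]


def run(guard_with_main_process):
    """Run one simulated save, with or without the main-process guard."""
    saver = ToyShardedSaver(WORLD_SIZE)
    results = {}

    def worker(rank):
        is_main_process = rank == 0
        if guard_with_main_process and not is_main_process:
            return  # guarded: non-main ranks never call save_model
        results[rank] = saver.save_model(rank, shard=f"shard-{rank}")

    threads = [threading.Thread(target=worker, args=(r,)) for r in range(WORLD_SIZE)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


if __name__ == "__main__":
    print("guarded:  ", run(guard_with_main_process=True))
    print("unguarded:", run(guard_with_main_process=False))
```

In the guarded run, rank 0 waits alone until the timeout and ends up with no state to save. In the unguarded run, every rank takes part and each one sees all four shards. Whether a real setup behaves this way depends on how the model state is partitioned, so treat this as an analogy, not a description of the library internals.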