qiaojiim

Results: 1 comment by qiaojiim

I trained and `save_checkpoint` using 4 GPUs, but when I tried to `load_checkpoint` using 1 GPU, I encountered the same issue. I suspect that ZeRO3 splits the model and saves...
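
The suspicion in the comment can be pictured with a toy sketch. This is not DeepSpeed's actual on-disk format or API: `shard_params`, `consolidate`, and the 10-element "model" are hypothetical names made up for illustration. It only shows why a checkpoint written as one shard per rank cannot be read back by a single process that looks at its own shard alone.

```python
# Toy illustration only: NOT DeepSpeed's real checkpoint layout.
# All names here (shard_params, consolidate, params) are hypothetical.

def shard_params(flat_params, world_size):
    """Split a flat parameter list into one contiguous shard per rank,
    zero-padding the tail so every shard has the same length."""
    shard_len = -(-len(flat_params) // world_size)  # ceiling division
    padded = flat_params + [0.0] * (shard_len * world_size - len(flat_params))
    return [padded[r * shard_len:(r + 1) * shard_len] for r in range(world_size)]

def consolidate(shards, num_params):
    """Concatenate every rank's shard in rank order and drop the padding."""
    flat = [x for shard in shards for x in shard]
    return flat[:num_params]

params = [0.1 * i for i in range(10)]        # hypothetical 10-element model
shards = shard_params(params, world_size=4)  # what each of 4 GPUs would save

single_rank_view = shards[0]                 # all that 1 GPU sees on its own
restored = consolidate(shards, len(params))  # needs all 4 shards
```

Here `single_rank_view` holds only a fraction of the parameters, while `restored` matches `params` because all four shards were merged first. In practice, DeepSpeed provides a `zero_to_fp32.py` script for consolidating ZeRO checkpoints into a single state dict; whether that resolves the issue described here is an assumption, not something the comment confirms.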