Jan Nie
Results
2 comments of Jan Nie
I'm facing the same problem: when using DeepSpeed ZeRO-2 with offload, the process is easily killed while saving a shard of the model. Please add at least a warning to inform users of the insufficient...
I only got 86.8% accuracy on the ACE05 dataset, while your paper reports 89.9%. Is there any solution? I just ran demo.sh. I guess this is your...