Jan Nie

2 comments by Jan Nie

I am facing the same problem: when using DeepSpeed ZeRO-2 with offload, the process is easily killed while saving a model shard. Please add at least a warning to inform users of insufficient...

I only got 86.8% accuracy on the ACE05 dataset, while your paper reports 89.9%. Is there any solution? I just ran demo.sh. I guess this is your...