Zhuosheng Zhang

Results: 24 comments by Zhuosheng Zhang

Hi, training may take about 8 hours for the base model and about 24 hours for the large model on an A100 GPU. This may also depend on the exact GPU. As it has been a long...

Hi guys, thanks for your interest. The released models are ones I reproduced with limited computational resources after my internship finished. It is possible to obtain better results with...

Hi, there is no data leakage problem because no gold labels are used. We only collect the questions for automatic rationale generation.

Indeed. This typo is quite strange. We will correct it in the paper soon. Thanks!