ReloJeffrey

Results 3 issues of ReloJeffrey

When fine-tuning the unsupervised pre-trained model, only the erm result can be reproduced; the co_tuning and bi_tuning accuracy is lower than erm

bug

The codebase provides the training code, but how can the eval results in the paper 'DeepNet: Scaling Transformers to 1,000 Layers' be reproduced? Could you please provide the code to...