ReloJeffrey
Results
3 issues of ReloJeffrey
When fine-tuning the unsupervised pre-trained model, only the erm result can be reproduced; the co_tuning and bi_tuning accuracy is lower than erm
bug
The codebase provides the training code, but how can the evaluation results in the paper 'DeepNet: Scaling Transformers to 1,000 Layers' be reproduced? Could you please provide the code to...