YupengZheng comments

Results: 2 comments of YupengZheng

Training loss curve on V2

During pretraining, using the Wikipedia dataset with a learning rate of 1e-4 can help the model escape the local optimum, and the loss can be reduced to...

Training loss curve on V2

@toilaluan When I train ICAE with lm_ratio=0, the loss drops below 0.1. However, when I set lm_ratio=0.4, I run into the same problem as you. What lm_ratio are you using? By...
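
For readers unfamiliar with the setting, here is a minimal sketch of one way to read `lm_ratio`. It assumes `lm_ratio` is the probability that a training sample uses the language-modeling (text continuation) objective instead of the autoencoding objective. That reading is an assumption, and the helper names `pick_objective` and `objective_counts` are hypothetical; this is not taken from the ICAE code.

```python
import random


def pick_objective(lm_ratio: float, rng: random.Random) -> str:
    """Return "lm" with probability lm_ratio, otherwise "ae".

    Hypothetical helper: assumes lm_ratio is a per-sample probability
    of choosing the language-modeling objective over autoencoding.
    """
    if not 0.0 <= lm_ratio <= 1.0:
        raise ValueError("lm_ratio must be in [0, 1]")
    # rng.random() is in [0, 1), so lm_ratio=0 never picks "lm"
    # and lm_ratio=1 always picks "lm".
    return "lm" if rng.random() < lm_ratio else "ae"


def objective_counts(lm_ratio: float, n_samples: int, seed: int = 0) -> dict:
    """Count how many of n_samples fall under each objective."""
    rng = random.Random(seed)
    counts = {"ae": 0, "lm": 0}
    for _ in range(n_samples):
        counts[pick_objective(lm_ratio, rng)] += 1
    return counts


if __name__ == "__main__":
    for ratio in (0.0, 0.4):
        print(ratio, objective_counts(ratio, 10_000))
```

Under this reading, `lm_ratio=0` trains on autoencoding alone, while `lm_ratio=0.4` sends roughly 40% of samples to the continuation objective, so the two settings average over different task mixes and their loss curves are not directly comparable.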