bert-stable-fine-tuning
Loss surface axis
Hi Authors,
Thanks for your impressive paper. I'm very interested in your implementation of the loss surfaces, and I have checked the original loss surface paper (Li et al., 2018). I was wondering why you set the axes to θf−θp and θs−θp in Figure 7.
In my understanding, you are using them as the two directions instead of random directions. But why is θf located at θf−θp=1 and θs located at θs−θp=1?
Could you explain more on this and hopefully share your code for generating this surface?
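To make my current understanding concrete, here is a minimal sketch of how I assume the plane is parameterized: every point is θp + α(θf−θp) + β(θs−θp), so θp sits at (0, 0), θf at (1, 0), and θs at (0, 1). This is only my guess, not your implementation. The names `plane_point`, `loss_grid`, and `toy_loss` are hypothetical, and the toy quadratic loss on flat NumPy vectors stands in for the real model loss.

```python
import numpy as np

def plane_point(theta_p, theta_f, theta_s, alpha, beta):
    """Point in the plane spanned by (theta_f - theta_p) and (theta_s - theta_p).

    Assumed parameterization: theta_p + alpha * d1 + beta * d2.
    """
    d1 = theta_f - theta_p  # first axis direction
    d2 = theta_s - theta_p  # second axis direction
    return theta_p + alpha * d1 + beta * d2

def loss_grid(loss_fn, theta_p, theta_f, theta_s, alphas, betas):
    """Evaluate loss_fn on a grid of (alpha, beta) coordinates."""
    grid = np.empty((len(alphas), len(betas)))
    for i, a in enumerate(alphas):
        for j, b in enumerate(betas):
            grid[i, j] = loss_fn(plane_point(theta_p, theta_f, theta_s, a, b))
    return grid

# Toy stand-ins: random flat parameter vectors and a quadratic loss
# whose minimum is at theta_f (hypothetical, for illustration only).
rng = np.random.default_rng(0)
theta_p, theta_f, theta_s = rng.normal(size=(3, 5))

def toy_loss(theta):
    return float(np.sum((theta - theta_f) ** 2))

alphas = np.linspace(-0.5, 1.5, 5)  # includes alpha = 0 and alpha = 1
betas = np.linspace(-0.5, 1.5, 5)   # includes beta = 0 and beta = 1
grid = loss_grid(toy_loss, theta_p, theta_f, theta_s, alphas, betas)
```

With this parameterization, a coordinate of 1 along an axis simply means "one full step of that difference vector from θp", which would explain where θf and θs appear in the figure. Is this what you did?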
Also, in your paper, you say that there is a barrier between θf and θs. However, it looks like a similar barrier also exists between θf and θp. If so, how does θp gradually reach θf?
Looking forward to your reply.