Gradient_Starvation
Question about SD loss in cmnist code
Hi, I just came across this very interesting project of yours and was quite excited while reading the paper on arXiv.
While looking at the implementation of Spectral Decoupling (SD) for cmnist, I noticed that flags.sd_coef is multiplied with both the train loss and the SD loss term. I am referring to this line: https://github.com/mohammadpz/Gradient_Starvation/blob/main/Figure_4_and_table_1/cmnist.py#L155
Is this the intended usage or should it be something like loss = train_nll + (flags.sd_coef * sd)?
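To make the two readings concrete, here is a minimal sketch of how they differ. It is not copied from cmnist.py: the function names, the toy numbers, and the exact form of the "scaled both" variant are my assumptions, and sd_penalty is a stand-in that assumes the SD term is the mean squared logit.

```python
import numpy as np

def sd_penalty(logits):
    # Assumed stand-in for the SD term: mean of the squared logits.
    return float(np.mean(logits ** 2))

def loss_scaled_both(train_nll, sd, sd_coef):
    # Reading 1 (hypothetical form): the coefficient multiplies
    # the training loss and the SD term alike.
    return sd_coef * (train_nll + sd)

def loss_penalty_only(train_nll, sd, sd_coef):
    # Reading 2: the coefficient weights only the SD term.
    return train_nll + sd_coef * sd

# Toy values, chosen only for illustration.
logits = np.array([2.0, -1.0, 0.5])
train_nll = 0.7
sd_coef = 0.1

sd = sd_penalty(logits)
both = loss_scaled_both(train_nll, sd, sd_coef)
penalty_only = loss_penalty_only(train_nll, sd, sd_coef)
print(sd, both, penalty_only)
```

Under these assumptions, the first form keeps the NLL and SD terms at a fixed 1:1 ratio regardless of sd_coef, so the coefficient only rescales the total loss. In the second form, sd_coef sets how strongly the penalty weighs against the NLL.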
Thank you.