Switchable-Whitening
How can I early-stop the update of the branch weights, as described in the paper?
You can split the parameters into two groups in the optimizer, as below:
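A minimal PyTorch sketch of the split (the `ToyNet` model, the `'sw'` name filter, and all hyperparameters here are illustrative assumptions, not the repo's actual code):

```python
import torch
import torch.nn as nn

class ToyNet(nn.Module):
    """Hypothetical stand-in for a network containing SW layers."""
    def __init__(self):
        super().__init__()
        self.conv = nn.Conv2d(3, 16, 3)
        # Toy SW branch weights; in the real model these are the
        # importance weights of the Switchable Whitening layers.
        self.sw_weight = nn.Parameter(torch.ones(3))

model = ToyNet()

# Group 0: regular weights; group 1: SW branch weights (matched here
# by the substring "sw" in the parameter name -- adjust the filter to
# the actual parameter naming in your model).
sw_params = [p for n, p in model.named_parameters() if 'sw' in n]
base_params = [p for n, p in model.named_parameters() if 'sw' not in n]

optimizer = torch.optim.SGD(
    [
        {'params': base_params},
        {'params': sw_params, 'lr': 0.1},
    ],
    lr=0.1, momentum=0.9, weight_decay=1e-4,
)
```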
Then, partway through training, you can set the learning rate of the SW group to zero.
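For example (`freeze_epoch` and the group index follow the sketch above and are assumptions):

```python
freeze_epoch = 60  # hypothetical epoch at which to stop updating SW weights

for epoch in range(90):
    if epoch == freeze_epoch:
        # Freeze the SW branch weights by zeroing their group's
        # learning rate; group 1 is the SW group in the sketch above.
        optimizer.param_groups[1]['lr'] = 0.0
    # ... run one epoch of training here ...
```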
Can you show some results on how severe the overfitting problem is?
@JiyueWang The initial version of SW was implemented with SVD, where omitting the early stop led to a drop of about 0.5 points. The current version of SW uses Newton's iteration, which is naturally stochastic, so the overfitting is alleviated. We observe that the effect of early stopping is negligible in this case.