
Question about the method

Open kayuksel opened this issue 6 years ago • 2 comments

What prevents the model from returning to the same local minimum when using cyclic learning rates or warm restarts? Would it make sense to regularize the model so that its predictions are as different as possible from those of the ensemble accumulated up to that point? For example, in classification one could use the Kullback–Leibler or Jensen–Shannon divergence between the ensemble's averaged predictions and the predictions of the model currently being trained as a regularizer. I would love to hear your opinion on this draft idea. Thanks a lot.
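For concreteness, here is a minimal NumPy sketch of the diversity regularizer described above. The function names, the `beta` weight, and the loss shape are illustrative assumptions of mine, not part of the Snapshot Ensembles code:

```python
import numpy as np

def js_divergence(p, q, eps=1e-12):
    """Jensen-Shannon divergence between two discrete probability vectors."""
    p = np.clip(p, eps, 1.0)
    q = np.clip(q, eps, 1.0)
    m = 0.5 * (p + q)
    # JS(p, q) = 0.5 * KL(p || m) + 0.5 * KL(q || m)
    kl_pm = np.sum(p * np.log(p / m))
    kl_qm = np.sum(q * np.log(q / m))
    return 0.5 * kl_pm + 0.5 * kl_qm

def diversity_regularized_loss(ce_loss, current_probs, snapshot_probs, beta=0.1):
    """Subtract a diversity bonus from the task loss: the JS divergence
    between the current model's predictions and the averaged predictions
    of the snapshots collected so far. Maximizing this divergence pushes
    training away from regions the ensemble already covers."""
    ensemble_mean = np.mean(snapshot_probs, axis=0)  # average over snapshots
    return ce_loss - beta * js_divergence(current_probs, ensemble_mean)
```

In an actual training loop the divergence would be computed per batch on the softmax outputs, with `beta` controlling how strongly exploration is rewarded relative to the classification loss.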

kayuksel avatar Jan 31 '19 00:01 kayuksel

This sounds like a good idea, but I'm not sure whether it will improve performance significantly given that the computational budget is limited. You may also want to check whether people have come up with similar ideas in the context of general ensemble learning. Anyway, I would be curious to know the results if you decide to work on this.

gaohuang avatar Jan 31 '19 09:01 gaohuang

Thanks for your response. The idea is actually intended to make better use of the limited computational budget, since it tries to prevent revisiting the same local minima and instead encourages exploration. I also can't estimate the performance improvement, but I believe it could be especially useful for RL, where developing alternative policies is valuable. My pipeline is quite full at the moment, but I will let you know more about the experiments if I ever pick this up.

kayuksel avatar Jan 31 '19 15:01 kayuksel