CSP
What do mode_stu and model_tea mean?
Quoting the paper: "We also apply the strategy of moving average weights proposed in [45]". I suppose "tea" stands for teacher and "stu" for student. The idea is that the teacher's weights are a moving average of the student's weights over training steps, which improves the generalisation of the learned model.
[45] Tarvainen, A., Valpola, H.: Mean teachers are better role models: Weight-averaged consistency targets improve semi-supervised deep learning results. In: Advances in neural information processing systems, pp. 1195–1204 (2017)
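To make the teacher/student idea concrete, here is a minimal sketch of a moving-average weight update in plain Python. It is not taken from the CSP code: the function name ema_update, the parameter alpha, and the toy "training step" are all assumptions made for illustration, and the weights are just flat lists of floats.

```python
def ema_update(teacher_weights, student_weights, alpha=0.999):
    """Return alpha * teacher + (1 - alpha) * student, element-wise.

    Hypothetical helper for illustration; alpha is the assumed decay rate.
    """
    return [alpha * t + (1.0 - alpha) * s
            for t, s in zip(teacher_weights, student_weights)]


# Toy weights: in a real model these would be the layer weight arrays.
student = [1.0, 2.0]
teacher = list(student)  # the teacher starts as a copy of the student

for step in range(3):
    # Stand-in for one optimiser step on the student (assumption, not real training).
    student = [w + 1.0 for w in student]
    # The teacher is never trained directly; it only tracks the student.
    teacher = ema_update(teacher, student, alpha=0.5)

print(student)
print(teacher)
```

With a decay close to 1 the teacher changes slowly and smooths out the noise in individual student updates, which is the motivation given in [45]; alpha=0.5 is used here only to keep the toy numbers easy to follow.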
thank you