mean-teacher
is alpha set wrong?
In the paper, alpha is said to be 0.99 at the beginning of training (when global_step is small) and 0.999 at the end (when global_step is large). However, in the code:
alpha = min(1 - 1 / (global_step + 1), alpha)
Following this, alpha is 0 when global_step is small, and equals alpha (set to 0.99 by the parameters) once global_step reaches 99. The code seems different from what the paper presents. The paper suggests the code should be
alpha = max(1 - 1 / (global_step + 1), alpha)
Does anyone else see an issue here?
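For comparison, here is a quick sketch that evaluates both formulas (the `min` version from the repo and the `max` version suggested above) at a few values of global_step. The helper names are mine, not from the repo:

```python
def alpha_min(global_step, alpha=0.99):
    # Formula as found in the repository: ramps alpha up from 0 toward 0.99.
    return min(1 - 1 / (global_step + 1), alpha)

def alpha_max(global_step, alpha=0.99):
    # Variant suggested above: starts at 0.99 and approaches 1 as steps grow.
    return max(1 - 1 / (global_step + 1), alpha)

for step in (0, 9, 99, 999):
    print(f"step={step:4d}  min-version={alpha_min(step):.4f}  "
          f"max-version={alpha_max(step):.4f}")
```

So the `min` version gives 0.0 at step 0 and is capped at 0.99 forever after step 99, while the `max` version gives 0.99 at step 0 and keeps growing toward 1 (0.999 at step 999), which matches the 0.99-to-0.999 schedule described in the paper.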
I have the same confusion. What's more, alpha is a function of global_step, so when batch_size changes, the number of steps per epoch also changes. But the paper says that alpha is tied to the ramp-up epoch.
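To make the batch-size dependence concrete, here is a small sketch (dataset size and batch sizes are made-up numbers for illustration) computing at which epoch the `min` formula first reaches the target alpha:

```python
import math

def steps_to_reach(target_alpha):
    # Smallest global_step where 1 - 1/(step + 1) >= target_alpha.
    return math.ceil(1 / (1 - target_alpha)) - 1

def epoch_when_reached(target_alpha, dataset_size, batch_size):
    # Epochs elapsed at that step; steps per epoch depends on batch_size.
    steps_per_epoch = dataset_size // batch_size
    return steps_to_reach(target_alpha) / steps_per_epoch

# Hypothetical dataset of 50,000 samples:
print(epoch_when_reached(0.99, 50000, 100))   # 500 steps/epoch -> ramp ends early in epoch 1
print(epoch_when_reached(0.99, 50000, 5000))  # 10 steps/epoch  -> ramp lasts ~10 epochs
```

With a small batch size the ramp-up finishes within a fraction of the first epoch, while with a large batch size it stretches over many epochs, so a schedule defined in steps cannot correspond to a fixed ramp-up epoch across batch sizes.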