Camera-based-Person-ReID
Why is momentum set to None?
Hi,
First and foremost, thanks for your code!
As shown in your code, you set the momentum of BN to None. In the testing stage, this means that:
running mean = mean of the last mini-batch
running var = var of the last mini-batch
So I wonder why the number of mini-batches influences your results.
I think this introduces randomness: if you happen to pick a good mini-batch you will get good results, even if you use only one mini-batch to compute the running mean and var of the test camera.
Hi. Thanks for your question. Momentum = None is not equivalent to Momentum = 0.0. Momentum = None is a unique feature of PyTorch: it computes the cumulative moving average. Please check https://pytorch.org/docs/stable/generated/torch.nn.BatchNorm2d.html for more details.
Thanks a lot, I found it.
momentum – the value used for the running_mean and running_var computation. Can be set to None for cumulative moving average (i.e. simple average)
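To make the difference concrete, here is a minimal sketch (not from this repository) contrasting the two settings in PyTorch: with momentum=None the running statistics converge to the simple average of all batch statistics seen so far, while with a fixed momentum such as 0.1 they form an exponential moving average weighted toward recent batches.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

bn_cma = nn.BatchNorm1d(4, momentum=None)   # cumulative moving average
bn_ema = nn.BatchNorm1d(4, momentum=0.1)    # exponential moving average

bn_cma.train()
bn_ema.train()

# Five mini-batches whose channel means shift from ~0 to ~4.
batches = [torch.randn(32, 4) + i for i in range(5)]
for x in batches:
    bn_cma(x)
    bn_ema(x)

print("momentum=None running_mean:", bn_cma.running_mean)  # close to ~2.0 per channel,
                                                            # the simple average over all batches
print("momentum=0.1  running_mean:", bn_ema.running_mean)  # noticeably lower, since only a
                                                            # 0.1 fraction of each new batch
                                                            # mean is mixed into the estimate
```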
So have you tried a different momentum, like 0.1?
I did not test momentum=0.1. I think the simple average is a better choice since it is robust to the order of the mini-batches, although it is slightly different from what is done in the training stage. Please feel free to share your results. Thanks a lot.
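For what it's worth, here is a hedged sketch of how one might refresh the BN running statistics on a test camera's data with momentum=None (refresh_bn_stats is a hypothetical helper, not the authors' code). Because the cumulative average is used, the refreshed statistics do not depend on the order of the mini-batches.

```python
import torch
import torch.nn as nn

@torch.no_grad()
def refresh_bn_stats(model, loader):
    """Re-estimate BN running statistics from `loader` (hypothetical helper).

    With momentum=None the refreshed statistics are the simple average over
    all mini-batches seen, so they do not depend on the batch order.
    """
    for m in model.modules():
        if isinstance(m, (nn.BatchNorm1d, nn.BatchNorm2d)):
            m.reset_running_stats()   # start from scratch for this camera
            m.momentum = None         # cumulative (simple) average
    model.train()                     # BN updates running stats only in train mode
    for images, _ in loader:          # assumes the loader yields (images, labels)
        model(images)
    model.eval()                      # back to inference mode for feature extraction
```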
Hi, I tried momentum=0.1 and it performed badly. momentum=None is a better choice.
Hi, thanks for the update. FYI, to reproduce the results in our paper, each experiment should be conducted 10 times. I will leave this issue open so that more people can see your results.
As you stated in another issue, you used momentum = None during training. I tried momentum = 0.1 and momentum = None, and found that with momentum = 0.1 the cross-dataset performance is much better than with momentum = None if the model is not updated at test time. However, if the model is updated at test time, the result is the opposite. Why?
@justopit Please give more information about your training sets and testing sets. For Market, Duke, and MSMT, I have the same results as @HeliosZhao: momentum=None is better than momentum=0.1. This post may also provide you with useful information.
Meanwhile, please report the average results over 10 repeated experiments.