Camera-based-Person-ReID

Why is momentum set to None?

Open · HeliosZhao opened this issue Sep 09 '20 · 7 comments

Hi,

First and foremost, thanks for your code!

As shown in your code, you set the momentum of BN to None. In the testing stage, this means that:

running mean = mean of the last mini-batch
running var = var of the last mini-batch

So I wonder why the number of mini-batches influences your results.

I think this introduces randomness: if you happen to choose the best mini-batch, you will get the best results, even if you use only one mini-batch to calculate the running mean and var of the test camera.

HeliosZhao · Sep 09 '20

Hi. Thanks for your question. momentum=None is not equivalent to momentum=0.0. momentum=None is a feature specific to PyTorch: it computes a cumulative moving average. Please check https://pytorch.org/docs/stable/generated/torch.nn.BatchNorm2d.html for more details.
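For illustration, here is a minimal standalone sketch of the difference (not code from this repo):

```python
import torch
import torch.nn as nn

# With momentum=0.1 the running stats are an exponential moving average,
# dominated by the most recent batches. With momentum=None they are a
# cumulative (simple) average over all batches seen since the last reset.
bn_ema = nn.BatchNorm2d(3, momentum=0.1)
bn_cma = nn.BatchNorm2d(3, momentum=None)

bn_ema.train()
bn_cma.train()
for _ in range(10):
    x = torch.randn(8, 3, 4, 4)
    bn_ema(x)
    bn_cma(x)

print(bn_ema.running_mean)  # weighted toward the last batches
print(bn_cma.running_mean)  # equal weight for every batch
```

In particular, with momentum=None the running statistics are not simply those of the last mini-batch.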

automan000 · Sep 09 '20

Thanks a lot, I found it:

> momentum – the value used for the running_mean and running_var computation. Can be set to None for cumulative moving average (i.e. simple average)
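Concretely (if I read the docs correctly): with momentum m, the update is running_stat ← (1 − m) · running_stat + m · batch_stat, an exponential moving average; with momentum=None, PyTorch instead uses the factor 1/num_batches_tracked, so running_stat ends up being the simple average of all batch statistics since the last reset.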

So, have you tried a different momentum, like 0.1?

HeliosZhao · Sep 09 '20

I did not test momentum=0.1. I think the simple average is a better choice since it is robust to the order of mini-batches, although it differs slightly from what is done in the training stage. Please feel free to share your results. Thanks a lot.
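As a quick sanity check on the order-robustness claim, a standalone sketch (not code from this repo):

```python
import torch
import torch.nn as nn

# Check that momentum=None running stats do not depend on batch order.
torch.manual_seed(0)
batches = [torch.randn(8, 3, 4, 4) for _ in range(5)]

def running_mean_after(batch_list):
    bn = nn.BatchNorm2d(3, momentum=None)
    bn.train()
    for b in batch_list:
        bn(b)
    return bn.running_mean.clone()

same = torch.allclose(running_mean_after(batches),
                      running_mean_after(batches[::-1]))
print(same)  # True (up to floating point): the cumulative average ignores order
```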

automan000 · Sep 09 '20

Hi, I tried momentum=0.1 and it performed badly. momentum=None is a better choice.

HeliosZhao · Sep 13 '20

Hi, thanks for the update. FYI, to reproduce the results in our paper, each experiment should be conducted 10 times. I will leave this issue open so that more people can see your results.

automan000 · Sep 13 '20

> Hi, thanks for the update. FYI, to reproduce the results in our paper, each experiment should be conducted 10 times. I will leave this issue open so that more people can see your results.

As you stated in another issue, you used momentum=None during training. I tried momentum=0.1 and momentum=None, and found that with momentum=0.1 the cross-dataset performance is much better than with momentum=None if the model is not updated at test time. However, if the model is updated at test time, the result is the opposite. Why?

justopit · Dec 27 '20

@justopit Please give more information about your training and testing sets. For Market, Duke, and MSMT, I get the same results as @HeliosZhao: momentum=None is better than momentum=0.1. This post may also provide you with useful information.

Meanwhile, please report results averaged over 10 repeated runs of each experiment.
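For context on what "updating the model when testing" means in this thread, here is a minimal sketch of re-estimating BN statistics on target-camera data; adapt_bn_stats and loader are hypothetical names, not this repo's API:

```python
import torch
import torch.nn as nn

def adapt_bn_stats(model, loader, device="cpu"):
    # Reset BN running stats; with momentum=None the re-estimated stats
    # are a simple average over all adaptation batches, regardless of order.
    for m in model.modules():
        if isinstance(m, nn.BatchNorm2d):
            m.reset_running_stats()
            m.momentum = None
    model.train()           # BN updates its running stats only in train mode
    with torch.no_grad():   # forward passes only; no weight updates
        for images in loader:
            model(images.to(device))
    model.eval()
```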

automan000 · Dec 27 '20