STCN
About Training
First of all, thank you for your contribution and excellent work.
Recently, I have been trying to replicate your work. I first pre-trained on static images and BL30K, then ran the main training on DAVIS and YouTubeVOS, and finally used the trained weights for inference. However, when I submitted the inference results for DAVIS 2017 test-dev and YouTubeVOS 2018, the metrics were about 10 points lower than yours. For reference, I trained on two 3080 Ti GPUs and made sure the hyperparameters are exactly as you specified, so I suspect I have overlooked some detail. Could you please advise on what might be causing this discrepancy? Also, your paper mentions freezing the batch normalization layers during main training. Do I need to freeze them explicitly? I haven't seen any freezing operation in the code.
The default setting, without any modification, should work. See https://github.com/hkchengrex/STCN/issues/108
I'll check the training details again. Thank you for your reply.
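On the batch normalization question above: a common way to "freeze" BN in PyTorch is to keep the BN layers in eval mode while the rest of the network trains, so their running statistics are not updated. The sketch below is a generic illustration under that assumption, not code taken from the STCN repository; `freeze_batchnorm` and the toy network are hypothetical names introduced here.

```python
import torch
import torch.nn as nn

BN_TYPES = (nn.BatchNorm1d, nn.BatchNorm2d, nn.BatchNorm3d)


def freeze_batchnorm(model: nn.Module) -> nn.Module:
    """Put every BatchNorm layer in eval mode so running stats stay fixed."""
    for module in model.modules():
        if isinstance(module, BN_TYPES):
            module.eval()  # use stored running mean/var, do not update them
    return model


# Hypothetical toy network, only to demonstrate the effect.
net = nn.Sequential(nn.Conv2d(3, 8, 3, padding=1), nn.BatchNorm2d(8), nn.ReLU())
net.train()            # puts all layers, including BN, in training mode
freeze_batchnorm(net)  # BN back to eval mode; other layers stay in training mode

bn = net[1]
before = bn.running_mean.clone()
net(torch.randn(4, 3, 16, 16))  # forward pass does not touch the BN running stats
print(torch.equal(before, bn.running_mean))
```

Note that calling `net.train()` again switches the BN layers back to training mode, so a helper like this would have to be reapplied after each such call. This sketch leaves the BN affine parameters (`weight`, `bias`) trainable; whether those should also be fixed depends on the training recipe.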