DICL-Flow
Overfitting and difference in results compared to the pretrained checkpoint
Hi,
I am trying to replicate the training results by following the steps in the README. However, the model overfits after training all three stages on chairs and then evaluating on Sintel.

I downloaded the pretrained checkpoint for chairs provided in the repository and compared its EPEs with those of our trained model. Here are the results:
| | Chairs | Sintel Clean | Sintel Final |
|---|---|---|---|
| pretrained | 2.56 | 4.63 | 6.07 |
| ours | 1.56 | 7.76 | 15.58 |
As the table shows, our model is clearly overfitting on chairs. Could you please help me replicate the training so that the results match the pretrained model?
I have also noticed that the paper mentions a total of 150K iterations on FlyingChairs, while the README mentions 120 epochs for each stage of FlyingChairs. That amounts to 360 epochs in total, which I assumed would exceed 150K iterations. Here is the EPE on chairs after each stage of training:
| stage 0 | stage 1 | stage 2 |
|---|---|---|
| 2.16 | 1.65 | 1.56 |
Could you please provide the correct number of epochs for each stage of chairs?
Hi @prajnan93,
Thank you for your interest! The FlyingChairs dataset has 22232 training images, and we train the model on it with a batch size of 64, as shown in the README. Each epoch of FlyingChairs therefore corresponds to 22232/64 ≈ 350 = 0.35k iterations, so 360 epochs amount to around 130k iterations, which is less than 150k.
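A quick sanity check of the arithmetic above, using only the numbers given in this thread (22232 images, batch size 64, 120 epochs per stage, 3 stages):

```python
# Back-of-the-envelope epoch-to-iteration conversion for FlyingChairs,
# using the figures stated in this thread.
num_images = 22232
batch_size = 64
epochs_per_stage = 120
num_stages = 3

iters_per_epoch = num_images // batch_size  # ~347 iterations per epoch
total_iters = iters_per_epoch * epochs_per_stage * num_stages

print(iters_per_epoch)  # 347
print(total_iters)      # 124920, i.e. roughly 125k-130k < 150k
```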
Regarding the overfitting problem, are you using a batch size of 64? With a smaller batch size, the learning rate needs to be decreased correspondingly. For example, with a batch size of 16 instead of 64, we would empirically reduce the lr to 0.25 of its original value.
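The suggestion above is the standard linear scaling rule: multiply the base learning rate by the ratio of the actual batch size to the reference batch size. A minimal sketch (the base lr of 4e-4 here is a placeholder, not necessarily the repository's actual setting):

```python
def scaled_lr(base_lr: float, base_batch_size: int, batch_size: int) -> float:
    """Scale the learning rate linearly with the batch-size ratio."""
    return base_lr * batch_size / base_batch_size

# Batch size 16 instead of the reference 64 -> lr becomes 0.25x of the base.
print(scaled_lr(4e-4, base_batch_size=64, batch_size=16))  # 0.0001
```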
Best, Jianyuan