PWC-Net
Getting better results than the reported ones
Hi @deqings,
This is very elegant work, Dr. Sun. Implementing your architecture in TensorFlow and making a couple of changes, I was able to get better results than the ones reported at https://arxiv.org/abs/1709.02371. Please believe me when I write that sharing this with you isn't meant to criticize the value of your research in any way. I just want to single out these two low-hanging fruits so that you may also benefit from them, should you be interested in doing so.
The official multi-step schedule discussed in your paper is as follows: Slong (1.2M iters of training, batch size 8) + Sfine (500k iters of finetuning, batch size 4). Ours is Slong only: 1.2M iters, batch size 8, on a mix of FlyingChairs and FlyingThings3DHalfRes. FlyingThings3DHalfRes is our own version of FlyingThings3D in which every input image pair and ground-truth flow has been downsampled by two in each dimension. We also use a different set of augmentation techniques (details in augment.py).
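Downsampling an optical-flow sample has one subtlety worth noting: the flow *values* must be halved along with the resolution, since a displacement of 8 px at full resolution becomes 4 px at half resolution. Here is a minimal sketch of that preprocessing step (the function name and box-filter choice are my own; the repo's augment.py may do this differently, e.g. with bilinear resizing):

```python
import numpy as np

def make_half_res_sample(img1, img2, flow):
    """Downsample an image pair and its ground-truth flow by 2 in each
    dimension. The flow values are also scaled by 0.5, because pixel
    displacements shrink with the resolution. Sketch only."""
    def down2(x):
        # 2x2 box filter (assumes even H and W); a real pipeline
        # might use bilinear interpolation instead.
        return 0.25 * (x[0::2, 0::2] + x[1::2, 0::2]
                       + x[0::2, 1::2] + x[1::2, 1::2])
    return down2(img1), down2(img2), down2(flow) * 0.5
```

Forgetting the `* 0.5` is a classic bug: the images shrink but the labels still describe full-resolution motion.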
The motivation for using FlyingThings3DHalfRes is as follows: the average flow magnitude on the MPI-Sintel dataset is only 13.5, while the average flow magnitudes on FlyingChairs and FlyingThings3D are 11.1 and 38, respectively. In our experiments, finetuning on FlyingThings3D would only yield worse results on MPI-Sintel.
We got more stable results by using a half-resolution version of the FlyingThings3D dataset, with an average flow magnitude of 19, much closer to FlyingChairs and MPI-Sintel in that respect. We then trained on a mix of the FlyingChairs and FlyingThings3DHalfRes datasets.
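The magnitude statistics above are easy to reproduce with a small check like the following (a sketch of my own, not code from the repo). Note that halving the resolution, with flow values scaled accordingly, halves the average magnitude, which is consistent with 38 dropping to 19:

```python
import numpy as np

def mean_flow_magnitude(flows):
    """Average per-pixel flow magnitude over an iterable of (H, W, 2)
    flow fields, averaging per-field means as a simple summary."""
    mags = [np.linalg.norm(f, axis=-1).mean() for f in flows]
    return float(np.mean(mags))
```

Running this over each dataset's ground-truth flows gives a quick sanity check of how well its motion statistics match the target benchmark's.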
Our results are shown below:
| Model name | Notebooks | FlyingChairs (384x512) AEPE | Sintel clean (436x1024) AEPE | Sintel final (436x1024) AEPE |
|---|---|---|---|---|
| pwcnet-lg-6-2-multisteps-chairsthingsmix | train | 1.44 (notebook) | 2.60 (notebook) | 3.70 (notebook) |
| pwcnet-sm-6-2-multisteps-chairsthingsmix | train | 1.71 (notebook) | 2.96 (notebook) | 3.83 (notebook) |
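For readers comparing numbers: AEPE is the average endpoint error, i.e. the mean Euclidean distance between the predicted and ground-truth flow vectors over all pixels. A minimal sketch (my own helper, not the repo's evaluation code):

```python
import numpy as np

def aepe(flow_pred, flow_gt):
    """Average endpoint error between two (H, W, 2) flow fields:
    the per-pixel Euclidean distance, averaged over all pixels."""
    return float(np.linalg.norm(flow_pred - flow_gt, axis=-1).mean())
```

A real evaluation would additionally mask out invalid ground-truth pixels where the dataset provides a validity map.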
The officially reported results are as follows:
Thank you again for this very impressive work!
Respectfully, -- Phil
Hi Phil,
Thank you so much for sharing your findings. They are very interesting and I will look into them.
Meanwhile, I will keep this issue open so that the community can benefit from your findings.
Thanks again.
Respectfully,
Deqing
You're most welcome!
Hi,
This query is regarding the tuning of the model. Can you please confirm whether the PyTorch and Caffe models under this Git repo are fine-tuned on the KITTI dataset or not?
There seems to be confusion regarding the models shared under this Git repo, since the README file under the Caffe model reads as follows:
"Method description
The model here is PWC-Net with a larger feature pyramid extractor (PWC-Net-feature-uparrow, second row in Table 5(a) of our CVPR 2018 paper below)."
Referring to the paper, the performance of that model on KITTI-2015 (Fl-all) is 39.8%. We have not tested your Caffe model on the KITTI dataset (we assume the Caffe and PyTorch models are trained on the same data), but when we tested your PyTorch model on the 200 training images provided as part of the KITTI development kit (http://www.cvlibs.net/datasets/kitti/eval_scene_flow.php?benchmark=flow) to verify its performance, we got an error on the order of 9.7% (Fl-all) on these 200 training images, which is quite different from the figure in the second row of Table 5(a) of your CVPR 2018 paper.
So can you please confirm whether the models shared under this Git repo (Caffe and/or PyTorch) are fine-tuned on the KITTI dataset or not?
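For anyone reproducing this comparison: per the public KITTI-2015 benchmark definition, Fl counts a pixel as an outlier when its endpoint error exceeds both 3 px and 5% of the ground-truth flow magnitude. A sketch of that metric (my own helper, not the KITTI devkit):

```python
import numpy as np

def fl_all(flow_pred, flow_gt, valid=None):
    """KITTI-2015 Fl outlier rate in percent: fraction of (valid) pixels
    whose endpoint error exceeds both 3 px and 5% of the ground-truth
    flow magnitude. `valid` is an optional boolean mask."""
    epe = np.linalg.norm(flow_pred - flow_gt, axis=-1)
    mag = np.linalg.norm(flow_gt, axis=-1)
    outlier = (epe > 3.0) & (epe > 0.05 * mag)
    if valid is not None:
        outlier, n = outlier[valid], int(valid.sum())
    else:
        n = outlier.size
    return 100.0 * float(outlier.sum()) / float(n)
```

Note that a large Fl gap between a paper's table and a local run often comes down to which split was evaluated (training images, which the network may have seen during finetuning, vs. the held-out test server) rather than the metric itself.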
Respectfully,
Romi Srivastava
Hi @rockingromi I believe your concerns are addressed in issue #26. If your question isn't completely answered there, would you consider deleting your comment here and re-open issue #26 instead? It will help others who may wonder about the same issue if all answers re: finetuning on KITTI are kept in the same thread. Respectfully, -- Phil
Thank you, Phil :)