MVSNet

Evaluation accuracy and completeness using shared pretrained weights

tatsy opened this issue 5 years ago · 5 comments

Hi @YoYo000,

I am sorry to bother you again, but I have a question about evaluating the pretrained weights shared in README.md.

In my experiment, I got your code working with Python 3.x and ran it with the shared pretrained weights (I used the one trained on the DTU dataset). Then I evaluated the output point clouds using the evaluation code provided on the official DTU dataset website.

I ran the program in the following environment:

  • OS: Ubuntu 16.04 LTS (via Docker)
  • GPU: NVIDIA GeForce GTX 1080 Ti
  • CUDA: 9.0
  • cuDNN: 7.6.4
  • Python: 3.7
  • TensorFlow: 1.15.0

After this experiment, I found that the evaluation scores, namely the accuracy and the completeness, for the output point clouds of both MVSNet and R-MVSNet are slightly worse than those reported in your papers (even though the current code on GitHub uses UNet while the paper versions use UniNet).

Specifically, the accuracy and completeness I got were as follows (values in parentheses are those reported in your papers):

MVSNet
Command: python test.py --regularization 3DCNNs --inverse_depth False --max_w 1152 --max_h 864 --max_d 192 --interval_scale 1.06
Accuracy: 0.4846 (0.396)
Completeness: 0.4908 (0.527)
Overall: 0.4877 (0.462)

R-MVSNet
Command: python test.py --regularization GRU --inverse_depth False --max_w 1600 --max_h 1200 --max_d 256 --interval_scale 0.8
Accuracy: 0.4418 (0.385)
Completeness: 0.5102 (0.459)
Overall: 0.4760 (0.422)

I do not believe the versions of Python, CUDA, and cuDNN affect the results, so I suspect my evaluation procedure differs from yours. Did you modify the evaluation code distributed with the DTU dataset? (I checked that the evaluation code correctly reproduces the accuracy and completeness reported in your papers for furu, camp, and tola.)
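For reference, the two metrics I am computing are roughly the following (a minimal NumPy/SciPy sketch, not the official DTU MATLAB script, which additionally applies observability masks; the 20 mm outlier threshold here is my own assumption):

```python
# Minimal sketch of DTU-style accuracy/completeness, assuming pred_pts and
# gt_pts are (N, 3) NumPy arrays in millimetres.  This approximates, but does
# not reproduce, the official MATLAB evaluation.
import numpy as np
from scipy.spatial import cKDTree

def accuracy_completeness(pred_pts, gt_pts, max_dist=20.0):
    # Accuracy: for every reconstructed point, distance to the nearest GT point.
    d_pred_to_gt, _ = cKDTree(gt_pts).query(pred_pts)
    # Completeness: for every GT point, distance to the nearest reconstructed point.
    d_gt_to_pred, _ = cKDTree(pred_pts).query(gt_pts)

    # Ignore gross outliers before averaging (threshold is an assumption).
    acc = d_pred_to_gt[d_pred_to_gt < max_dist].mean()
    comp = d_gt_to_pred[d_gt_to_pred < max_dist].mean()
    return acc, comp
```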

I'd appreciate your advice on how to reproduce the results in your papers. Thank you very much for your help.

tatsy · Sep 01 '20

Hi @tatsy, The major difference between this repo and the two papers is the depth map fusion step. In the paper I described the fusion strategy as visibility fusion - visibility filter - average fusion - visibility filter. However, the implementation of this part is based on an Altizure internal library, so I cannot release the corresponding code. Instead, I modified the open-source Fusibile to chain the whole pipeline. Its method is something like average fusion - visibility filter, which is slightly worse than the proposed fusion method.

For R-MVSNet, the variational refinement part is not released for the same reason.

[screenshot attached]

YoYo000 · Sep 05 '20

Hi @YoYo000, I am very sorry for being away from the discussion, and thank you very much for your kind advice.

So, according to your advice, the difference is twofold. I think the first one, namely the variational depth map refinement, can be implemented by following the R-MVSNet paper.

For the second one, please let me ask some further questions. You said that your fusion pipeline is provided by fusibile, which includes average fusion and a visibility filter. From my understanding,

  • average fusion means averaging the positions and colors of the projected points.
  • visibility filter means the process of rejecting a point that is not visible from at least three views.

I confirmed that the above two are certainly included in fusibile. So, the remaining two are visibility fusion and visibility filter. I think the visibility filter is the process described in Depth Map Filter in Sec. 4.2 of the MVSNet paper, and visibility fusion is the one described in Merrell's depth fusion paper: http://graphics.stanford.edu/~pmerrell/Merrell_DepthMapFusion07.pdf
To be concrete, the sketch after this paragraph is roughly what I have in mind for the visibility filter.
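This is a minimal NumPy sketch of that geometric consistency check, just to confirm I understand the operation. The function names, the camera convention (x_cam = R X + t, pixel = K x_cam, cameras passed as (K, R, t) tuples), and the thresholds (1 px reprojection error, 1% relative depth error, at least 3 consistent views) are all my own assumptions, not values taken from the paper or from fusibile:

```python
# Sketch of a visibility filter: keep a reference-view pixel only if its depth
# is geometrically consistent with at least `min_views` source views.
# Assumes all depth maps share the reference resolution.
import numpy as np

def project(K, R, t, X):
    """World points (N, 3) -> pixel coords (N, 2) and camera-space depths (N,)."""
    x = (R @ X.T).T + t                      # world -> camera coordinates
    uv = (K @ x.T).T
    return uv[:, :2] / uv[:, 2:3], x[:, 2]

def backproject(K, R, t, uv, depth):
    """Pixels (N, 2) with depths (N,) -> world points (N, 3)."""
    ones = np.ones((uv.shape[0], 1))
    x = (np.linalg.inv(K) @ np.hstack([uv, ones]).T).T * depth[:, None]
    return (R.T @ (x - t).T).T

def visibility_filter(ref_depth, ref_cam, src_depths, src_cams,
                      min_views=3, pix_thresh=1.0, rel_depth_thresh=0.01):
    h, w = ref_depth.shape
    v, u = np.mgrid[0:h, 0:w]
    uv_ref = np.stack([u.ravel(), v.ravel()], axis=1).astype(np.float64)
    d_ref = ref_depth.ravel()

    X = backproject(*ref_cam, uv_ref, d_ref)            # ref pixels -> world
    consistent = np.zeros(d_ref.shape, dtype=np.int32)

    for src_depth, src_cam in zip(src_depths, src_cams):
        uv_src, _ = project(*src_cam, X)                 # world -> source view
        ui = np.clip(np.round(uv_src[:, 0]).astype(int), 0, w - 1)
        vi = np.clip(np.round(uv_src[:, 1]).astype(int), 0, h - 1)
        d_src = src_depth[vi, ui]

        # Re-project the sampled source depth back into the reference view.
        X_back = backproject(*src_cam, uv_src, d_src)
        uv_back, d_back = project(*ref_cam, X_back)

        reproj_err = np.linalg.norm(uv_back - uv_ref, axis=1)
        depth_err = np.abs(d_back - d_ref) / np.maximum(d_ref, 1e-8)
        consistent += ((reproj_err < pix_thresh) &
                       (depth_err < rel_depth_thresh)).astype(np.int32)

    mask = (consistent >= min_views).reshape(h, w)
    return np.where(mask, ref_depth, 0.0)                # 0 marks rejected pixels
```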

I would like to ask you whether my understanding is correct. If so, I think I can implement these steps by myself to reproduce them. If there are some differences, I would really appreciate it if you could elaborate on them.

Thank you very much.

tatsy · Oct 06 '20

Exactly! The visibility fusion is described in Merrell's depth fusion paper. I would suggest trying the visibility fusion first - the variational refinement is actually a little complicated.

BTW, I think many other works (PointMVSNet, CasMVSNet) also use this fusion pipeline (but I haven't tried it yet).

YoYo000 · Oct 06 '20

Hi @YoYo000, thank you very much for your reply. I will start working on reproducing your results again.

Also, thank you very much for the information about the other fusion pipeline. I knew about this repo but had not noticed its fusion pipeline. I am going to test it right away!

tatsy · Oct 06 '20

Sorry, I want to ask: how do you get the accuracy and completeness?

LiYaolab · Oct 08 '21