
Point cloud is worse than original project on DTU

Open YuhsiHu opened this issue 2 years ago • 21 comments

Hi, thank you for your code! I ran your pre-trained model on the DTU dataset, but the point cloud is not as complete as the one the original project generates. [images attached]

YuhsiHu avatar Jul 08 '21 09:07 YuhsiHu

Please give more details: which pretrained model do you use, and what eval command? I just reran `python eval.py --dataset_name dtu --scan scan1` with this model https://github.com/kwea123/CasMVSNet_pl/releases/tag/v1.0, and this is what I got: [screenshot attached] It contains 29.44M points. I don't see any problem.

kwea123 avatar Jul 08 '21 11:07 kwea123

Yes, I use your pre-trained model from the URL you posted: `_ckpt_epoch_10.ckpt`. And I use this command:

```
python eval.py \
    --root_dir $DTUTESTPATH \
    --dataset_name dtu \
    --split test \
    --ckpt_path $CKPT_FILE
```

The scan1 point cloud file I get is 238M (16.66M points), which is smaller than the one I got from the original project (418M). What should I do to generate 29.44M points? Thank you for your time!
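As a hedged back-of-envelope check (assuming the fused cloud is a binary PLY with float32 xyz plus uint8 rgb, i.e. 15 bytes per vertex, ignoring the header), the reported file sizes and point counts are roughly consistent, so the smaller file really does contain fewer points rather than being stored differently:

```python
# Rough sanity check: a binary PLY with float32 positions + uint8 colors
# uses 3*4 + 3*1 = 15 bytes per vertex (assumption; header/extras ignored).
BYTES_PER_POINT = 3 * 4 + 3 * 1

def approx_points_millions(size_mib):
    """Approximate vertex count (in millions) from a file size in MiB."""
    return size_mib * 2**20 / BYTES_PER_POINT / 1e6

print(round(approx_points_millions(238), 2))  # 238 MiB -> ~16.64, close to the reported 16.66M
print(round(approx_points_millions(418), 2))  # 418 MiB -> ~29.22, close to the reported 29.44M
```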

YuhsiHu avatar Jul 09 '21 03:07 YuhsiHu

It is strange. Do you use the latest master code? Also please make sure python==3.7 and that all package versions match requirements.txt. Other than that, I don't know what the reason is; there is no randomness in the algorithm itself. (Different CUDA versions can produce different numerical results, which is why I ask about the package versions.)
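A quick, hedged way to compare the two environments is to dump the installed versions on each machine and diff them (the package names below are illustrative; check them against requirements.txt):

```python
import sys
from importlib.metadata import version, PackageNotFoundError

def report_versions(packages):
    """Map each package name to its installed version ('' if missing)."""
    out = {"python": sys.version.split()[0]}
    for pkg in packages:
        try:
            out[pkg] = version(pkg)
        except PackageNotFoundError:
            out[pkg] = ""
    return out

if __name__ == "__main__":
    # Compare this output between the machine that works and the one that doesn't.
    for name, ver in report_versions(["torch", "torchvision", "numpy"]).items():
        print(f"{name}: {ver or 'NOT INSTALLED'}")
```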

kwea123 avatar Jul 09 '21 03:07 kwea123

To keep the versions consistent, I did not use an RTX 3090; I used a 2080 Ti. I did use python 3.7, with cudatoolkit==10.1, cudnn==8.0.5, pytorch==1.7.1, and torchvision==0.8.2 (0.5.0 reports an error). All other packages match requirements.txt.

It is strange. I will test it on other machines, and once there are new results I will report back. Thank you again for your prompt reply.

YuhsiHu avatar Jul 09 '21 06:07 YuhsiHu

I don't think the GPU is the problem. requirements.txt specifies pytorch==1.4.0; can you try that version?

kwea123 avatar Jul 09 '21 07:07 kwea123

I tried pytorch==1.4.0, and the result is the same as with pytorch==1.7.1: the scan1 point cloud file is 238M.

YuhsiHu avatar Jul 09 '21 07:07 YuhsiHu

Did you change some default parameters by chance? Can you post your eval.py?

kwea123 avatar Jul 09 '21 08:07 kwea123

Hmm, your eval.py looks good, and when I used it I got the correct result (29.44M points). I had another thought: how about setting `torch.backends.cudnn.benchmark = False` at line 19? This flag also introduces some nondeterminism.

kwea123 avatar Jul 09 '21 09:07 kwea123

It’s a pity that it didn’t work.

YuhsiHu avatar Jul 09 '21 12:07 YuhsiHu

Can you add the --save_visual argument? It saves the depth visualization; I want to make sure it looks the same as mine. These are my depth_visual_0000.jpg and proba_visual_0000.jpg. [images attached]

kwea123 avatar Jul 09 '21 13:07 kwea123

It is strange that they are different. I did download the latest version of the code. [images attached]

YuhsiHu avatar Jul 11 '21 13:07 YuhsiHu

OK, so if the code is the same, the problem can only come from the data. Do you use the preprocessed data from https://github.com/kwea123/CasMVSNet_pl#data-download, or the original DTU data?

kwea123 avatar Jul 14 '21 00:07 kwea123

Thanks a lot! I think the folder named "Rectified" I used is wrong; the pictures in it are 640x512. They should be 1600x1200, right? What is the structure of your training and testing dataset? My training dataset is:

```
Cameras/        (txt files)
Depths/
├── scan1/      (pfm files)
├── scan2/
└── ...
Depths_raw/
├── scan1/      (pfm files)
├── scan2/
└── ...
Rectified/
├── scan1/      (png files)
├── scan2/
└── ...
```

And when training, I think the code uses the pictures in Rectified, which is different from the original Cascade-MVSNet.
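As a hedged sketch, a small script like this can confirm the top-level layout before training (folder names are taken from this thread; the `dtu_training` root path is a hypothetical example, and Depths_raw is reportedly unused so it is not checked):

```python
from pathlib import Path

# Top-level folders this thread expects under the DTU training root.
EXPECTED = ("Cameras", "Depths", "Rectified")

def missing_dtu_folders(root):
    """Return the expected top-level folders that are absent under root."""
    root = Path(root)
    return [name for name in EXPECTED if not (root / name).is_dir()]

if __name__ == "__main__":
    problems = missing_dtu_folders("dtu_training")  # hypothetical root path
    print("missing:", problems or "none")
```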

YuhsiHu avatar Jul 14 '21 05:07 YuhsiHu

I think almost everyone in this field uses MVSNet's preprocessed files, including me. So under Depths and Rectified there are folders named scanXX_train; these contain the 640x512 images used for training. In testing, the images in scanXX are used and resized to the img_wh specified in the argument. I forgot what Depths_raw is, but it seems it's not used for training (or testing).

So I'm very confused: the data you use seems to be the same as mine, and I don't understand why there is still a difference...

kwea123 avatar Jul 15 '21 00:07 kwea123

When testing, do we need to use the DTU testing data from MVSNet-pytorch, whose images are 1600x1200? Or do we just use the data from the training dataset (which is 640x512)? I downloaded the project again and re-tested, but the results are the same as before.

YuhsiHu avatar Jul 22 '21 07:07 YuhsiHu

Hi, I am now facing the same problem as you. Have you solved it? And by the way, how can I view the tensorboard without an events.out file?

stillwang96 avatar Jul 23 '21 04:07 stillwang96

@YuhsiHu When testing, I read the images and parameters from the full-size images; you can see the code. @stillwang96 I don't understand your second question: what do you want to see without an event file?

kwea123 avatar Jul 23 '21 08:07 kwea123

The loss curves, like log1.png and log2.png in your assets. I don't know how to view the tensorboard. Thank you.

stillwang96 avatar Jul 23 '21 08:07 stillwang96

Those are just screenshots. When you train, the program generates .event files under logs/. Then, in another terminal, run `tensorboard --logdir logs` to visualize them. This kind of question is better asked on Stack Overflow. Also, it is not related to this issue; next time please open a new issue.

kwea123 avatar Jul 23 '21 09:07 kwea123

I downloaded the dataset from the MVSNet project. The images under dtu_training/Rectified are 640x512, and there is a separate testing dataset (1600x1200), with each scan structured as scanXXX: cams, images, pairs.txt. But your dtu.py reads the images under the Rectified folders. So do I have to replace those images with the 1600x1200 ones?

YuhsiHu avatar Jul 23 '21 09:07 YuhsiHu

It's weird. I'm not sure, but it seems I'm stuck in the same issue in mvsnerf, which is based on this project. ;)

frspring avatar Aug 20 '21 06:08 frspring

> I downloaded the dataset from the MVSNet project. The images under dtu_training/Rectified are 640x512, and there is a separate testing dataset (1600x1200). But your dtu.py reads the images under the Rectified folders. So do I have to replace those images with the 1600x1200 ones?

Though this issue was reported a long time ago, for anyone else who encounters a similar problem, I think this is the solution: when testing, use the 1600x1200 images (Rectified_raw folder), which are cropped to 1152x864 (args.img_wh), instead of the 640x512 images (Rectified folder), which would be resized to 1152x864 by OpenCV's bilinear interpolation.

I got 28.88M points with the 1600x1200 images and only 15.72M points with the 640x512 images for scan1.
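The difference between the two preprocessing paths can be sketched with plain arithmetic (a hedged illustration; the exact crop the repo performs may differ, e.g. it may not be centered):

```python
def center_crop_box(src_w, src_h, dst_w, dst_h):
    """Top-left corner (x, y) and size of a centered crop; source must cover dst."""
    assert src_w >= dst_w and src_h >= dst_h, "source too small to crop"
    return ((src_w - dst_w) // 2, (src_h - dst_h) // 2, dst_w, dst_h)

def resize_scale(src_w, src_h, dst_w, dst_h):
    """Per-axis scale factors when resizing instead of cropping."""
    return dst_w / src_w, dst_h / src_h

# 1600x1200 -> 1152x864: every output pixel is a real sensor pixel.
print(center_crop_box(1600, 1200, 1152, 864))  # -> (224, 168, 1152, 864)

# 640x512 -> 1152x864: upsampling, so bilinear interpolation must invent detail.
print(resize_scale(640, 512, 1152, 864))       # -> (1.8, 1.6875)
```

The upsampled inputs carry less real detail, so presumably fewer depth estimates survive the confidence filtering, which matches the lower point counts reported above.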

xingchen2022 avatar Jun 22 '23 08:06 xingchen2022

> Though this issue was reported a long time ago, for anyone else who encounters a similar problem, I think this is the solution: when testing, use the 1600x1200 images (Rectified_raw folder), which are cropped to 1152x864 (args.img_wh), instead of the 640x512 images (Rectified folder), which would be resized to 1152x864 by OpenCV's bilinear interpolation.
>
> I got 28.88M points with the 1600x1200 images and only 15.72M points with the 640x512 images for scan1.

Exactly. I will close this issue.

YuhsiHu avatar Jun 22 '23 12:06 YuhsiHu