
An Error in retrain_CostAggregation.py.

Open xzjzsa opened this issue 2 years ago • 14 comments

Should `adaptor.load_state_dict(torch.load(args.load_path)['net'])` be changed to `adaptor.load_state_dict(torch.load(args.load_path)['fa_net'])`?

xzjzsa avatar May 31 '22 06:05 xzjzsa

Yes, sorry for the error.

SpadeLiu avatar Jun 02 '22 00:06 SpadeLiu
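For reference, a minimal sketch of the corrected loading. The `fa_net` checkpoint key follows the discussion above; the one-layer adaptor here is a hypothetical stand-in for the real adaptor defined in the repo, and an in-memory buffer stands in for `args.load_path`:

```python
import io
import torch
import torch.nn as nn

# Hypothetical minimal adaptor stand-in; the real adaptor is defined in the repo.
adaptor = nn.Conv2d(64, 32, kernel_size=1)

# The training script stores the adaptor weights under the 'fa_net' key.
buf = io.BytesIO()
torch.save({'fa_net': adaptor.state_dict()}, buf)
buf.seek(0)

# Corrected line from retrain_CostAggregation.py (previously indexed 'net'):
state = torch.load(buf)
adaptor.load_state_dict(state['fa_net'])
```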

@SpadeLiu Have you tested other network backbones such as ConvNeXt or Transformers?

xzjzsa avatar Jun 10 '22 09:06 xzjzsa

> @SpadeLiu Have you tested other network backbones such as ConvNeXt or Transformers?

Thanks for your suggestion. We have not tested these models due to a lack of resources, but I think they will work if they are pretrained on large-scale datasets. Please note one detail: the relatively shallow layers of these models should be used.

SpadeLiu avatar Jun 11 '22 01:06 SpadeLiu
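The "shallow layers" point can be sketched with a toy stand-in for a deep pretrained backbone (the layer split below is illustrative, not the one used in the paper): only the early, task-general stages are kept as the grafted feature extractor.

```python
import torch
import torch.nn as nn

# Illustrative stand-in for a deep pretrained backbone: keep only the first
# stages (shallow, task-general features) and discard the deeper,
# classification-specific stages.
deep_backbone = nn.Sequential(
    nn.Conv2d(3, 64, 3, padding=1), nn.ReLU(),                # stage 1 (keep)
    nn.Conv2d(64, 128, 3, stride=2, padding=1), nn.ReLU(),    # stage 2 (keep)
    nn.Conv2d(128, 256, 3, stride=2, padding=1), nn.ReLU(),   # stage 3 (drop)
    nn.Conv2d(256, 512, 3, stride=2, padding=1), nn.ReLU(),   # stage 4 (drop)
)

shallow = deep_backbone[:4]  # grafted feature extractor: first two stages only

x = torch.randn(1, 3, 64, 128)
feat = shallow(x)  # 128-channel features at half resolution
```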

@SpadeLiu Thanks for your reply. I found that using the deep features of these models does reduce generalization ability, and they do not perform as well as simpler models like VGG.

xzjzsa avatar Jun 11 '22 02:06 xzjzsa

> @SpadeLiu Thanks for your reply. I found that using the deep features of these models does reduce generalization ability, and they do not perform as well as simpler models like VGG.

In my opinion, features from shallow layers contain more task-general information, so it is simpler to recover (or extract) information for stereo matching with a feature adaptor.

On the other hand, the BN layer, which VGG does not use, might also affect the generalization ability.

SpadeLiu avatar Jun 12 '22 02:06 SpadeLiu
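One way to probe the BN effect is to swap the BatchNorm layers in a grafted backbone for identity and re-test generalization. A minimal helper for that experiment (hypothetical, not from the repo):

```python
import torch.nn as nn

def strip_batchnorm(module: nn.Module) -> nn.Module:
    """Recursively replace BatchNorm2d layers with Identity (in place)."""
    for name, child in module.named_children():
        if isinstance(child, nn.BatchNorm2d):
            setattr(module, name, nn.Identity())
        else:
            strip_batchnorm(child)
    return module

# Toy backbone with BN, for illustration only.
net = nn.Sequential(nn.Conv2d(3, 16, 3), nn.BatchNorm2d(16), nn.ReLU())
strip_batchnorm(net)
```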

@SpadeLiu I started training the U-Net adaptor from the checkpoint_baseline_8epoch.tar you provided, but my generalization results on KITTI 2015 went down. I am using the finalpass version of SceneFlow on a single GPU, with the batch size set to 8. Does this make a big difference in performance?

xzjzsa avatar Jun 15 '22 12:06 xzjzsa

@xzjzsa I do not think the batch size affects the performance much in this code. When training the adaptor, can you discard the udc loss and use only the smooth L1 loss? (Please see the comment at line 87 of `train_adaptor.py`.) We found experimentally that this works, but we cannot explain why. Alternatively, it may be better to train the basic model yourself.

SpadeLiu avatar Jun 16 '22 07:06 SpadeLiu
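Schematically, the suggested change amounts to dropping the udc term and keeping only the smooth L1 loss on valid pixels. The variable names and the maximum disparity of 192 below are illustrative; the actual switch is the commented line 87 in `train_adaptor.py`:

```python
import torch
import torch.nn.functional as F

disp_pred = torch.rand(2, 64, 128) * 192   # predicted disparity (illustrative)
disp_gt = torch.rand(2, 64, 128) * 192     # ground-truth disparity (illustrative)
mask = (disp_gt > 0) & (disp_gt < 192)     # valid-pixel mask

# loss = smooth_l1 + udc_loss  (original combined objective, schematically)
loss = F.smooth_l1_loss(disp_pred[mask], disp_gt[mask])  # smooth L1 only
```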

@SpadeLiu Thanks for your reply. I got results close to those reported in the paper. Were the results in the paper also obtained by training the adaptors with only the smooth L1 loss?

xzjzsa avatar Jun 21 '22 05:06 xzjzsa

@xzjzsa Yes. It seems that when the cost aggregation module is fixed, the udc loss does not guide the learning process well.

SpadeLiu avatar Jun 21 '22 08:06 SpadeLiu

@SpadeLiu Thanks!

xzjzsa avatar Jun 21 '22 12:06 xzjzsa

@SpadeLiu After I trained the baseline for 8 epochs myself, I tested grafting VGG features with its aggregation module, but the results seem unstable. Did you see the same problem in your training?

| epoch | EPE  | >3px   |
|------:|-----:|-------:|
| 1     | 2.18 | 0.0940 |
| 2     | 1.83 | 0.0858 |
| 3     | 1.71 | 0.0688 |
| 4     | 1.51 | 0.0696 |
| 5     | 3.38 | 0.1130 |
| 6     | 1.66 | 0.0734 |
| 7     | 1.76 | 0.0707 |
| 8     | 9.81 | 0.1760 |

xzjzsa avatar Jun 22 '22 01:06 xzjzsa

@xzjzsa I cannot explain this. In my opinion, if the cost volume is built by cosine similarity, the performance variation should not be so large.

SpadeLiu avatar Jun 24 '22 09:06 SpadeLiu
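For context, a cosine-similarity cost volume correlates L2-normalized left features with horizontally shifted right features, so every entry is bounded in [-1, 1], which tames the scale of the matching costs. A minimal sketch (function name and shapes are illustrative, not the repo's implementation):

```python
import torch
import torch.nn.functional as F

def cosine_cost_volume(left, right, max_disp):
    """Cost volume from cosine similarity of left and shifted right features.

    left, right: (B, C, H, W) feature maps; returns (B, max_disp, H, W).
    """
    B, C, H, W = left.shape
    left = F.normalize(left, dim=1)    # unit-norm features, so the dot
    right = F.normalize(right, dim=1)  # product below is cosine similarity
    volume = left.new_zeros(B, max_disp, H, W)
    for d in range(max_disp):
        if d == 0:
            volume[:, d] = (left * right).sum(dim=1)
        else:
            # compare left pixel (x) with right pixel (x - d)
            volume[:, d, :, d:] = (left[:, :, :, d:] * right[:, :, :, :-d]).sum(dim=1)
    return volume

vol = cosine_cost_volume(torch.randn(1, 32, 8, 16), torch.randn(1, 32, 8, 16), max_disp=4)
```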

Thanks!


xzjzsa avatar Jun 24 '22 14:06 xzjzsa

> @SpadeLiu After I trained the baseline for 8 epochs myself, I tested grafting VGG features with its aggregation module, but the results seem unstable. Did you see the same problem in your training? epoch 1: EPE 2.18, >3px 0.0940; epoch 2: EPE 1.83, >3px 0.0858; epoch 3: EPE 1.71, >3px 0.0688; epoch 4: EPE 1.51, >3px 0.0696; epoch 5: EPE 3.38, >3px 0.1130; epoch 6: EPE 1.66, >3px 0.0734; epoch 7: EPE 1.76, >3px 0.0707; epoch 8: EPE 9.81, >3px 0.1760

I think these experiments once again underline the importance of the feature adaptor; combining two separately trained modules may not be as simple as we imagine.

SpadeLiu avatar Aug 19 '22 06:08 SpadeLiu