CREStereo

Loss is NaN in the second batch when fine-tuning

Open If-only1 opened this issue 2 years ago • 4 comments

Hi, this is really nice work! However, when I fine-tune from your pre-trained model, the loss becomes NaN in the second batch. I checked the data fed to the model: the left and right images are the original data without any preprocessing, and the disparity is the absolute value. I can't tell where the problem is. Can you offer some advice? Thanks. The log follows:

left.max(), left.min(): Tensor(255.0, device=xpux:0) Tensor(0.0, device=xpux:0)
right.max(), right.min(): Tensor(255.0, device=xpux:0) Tensor(0.0, device=xpux:0)
gt_disp.max(), gt_disp.min(): Tensor(65.625, device=xpux:0) Tensor(0.0, device=xpux:0)
valid_mask.max(), valid_mask.min(): Tensor(1.0, device=xpux:0) Tensor(0.0, device=xpux:0)
The i-th iteration prediction loss :
0 Tensor(68.409615, device=xpux:0) Tensor(-0.72061765, device=xpux:0)
1 Tensor(69.27495, device=xpux:0) Tensor(-7.1237144, device=xpux:0)
2 Tensor(68.630264, device=xpux:0) Tensor(-2.3412788, device=xpux:0)
3 Tensor(67.001595, device=xpux:0) Tensor(-0.64989996, device=xpux:0)
4 Tensor(67.27512, device=xpux:0) Tensor(-0.53194094, device=xpux:0)
5 Tensor(66.031105, device=xpux:0) Tensor(-1.1353028, device=xpux:0)
6 Tensor(66.7748, device=xpux:0) Tensor(-2.5566366, device=xpux:0)
7 Tensor(66.69823, device=xpux:0) Tensor(-0.30609164, device=xpux:0)
8 Tensor(66.8682, device=xpux:0) Tensor(-0.37459654, device=xpux:0)
9 Tensor(66.893974, device=xpux:0) Tensor(-0.80092835, device=xpux:0)
10 Tensor(66.295364, device=xpux:0) Tensor(-1.110324, device=xpux:0)
11 Tensor(67.22122, device=xpux:0) Tensor(-3.059827, device=xpux:0)
12 Tensor(66.74182, device=xpux:0) Tensor(-0.807206, device=xpux:0)
13 Tensor(66.88104, device=xpux:0) Tensor(-0.45083997, device=xpux:0)
14 Tensor(67.27106, device=xpux:0) Tensor(-0.62685704, device=xpux:0)
15 Tensor(67.43465, device=xpux:0) Tensor(-0.7094991, device=xpux:0)
16 Tensor(67.55379, device=xpux:0) Tensor(-0.38040105, device=xpux:0)
17 Tensor(67.453476, device=xpux:0) Tensor(-1.5267422, device=xpux:0)
18 Tensor(67.46704, device=xpux:0) Tensor(-0.3359019, device=xpux:0)
19 Tensor(67.47497, device=xpux:0) Tensor(-0.32194442, device=xpux:0)
Tensor(255.0, device=xpux:0) Tensor(0.0, device=xpux:0)
Tensor(255.0, device=xpux:0) Tensor(0.0, device=xpux:0)
Tensor(69.34766, device=xpux:0) Tensor(0.0, device=xpux:0)
Tensor(1.0, device=xpux:0) Tensor(0.0, device=xpux:0)
0 Tensor(nan, device=xpux:0) Tensor(nan, device=xpux:0)
1 Tensor(nan, device=xpux:0) Tensor(nan, device=xpux:0)
2 Tensor(nan, device=xpux:0) Tensor(nan, device=xpux:0)
3 Tensor(nan, device=xpux:0) Tensor(nan, device=xpux:0)
4 Tensor(nan, device=xpux:0) Tensor(nan, device=xpux:0)
5 Tensor(nan, device=xpux:0) Tensor(nan, device=xpux:0)
6 Tensor(nan, device=xpux:0) Tensor(nan, device=xpux:0)
7 Tensor(nan, device=xpux:0) Tensor(nan, device=xpux:0)
8 Tensor(nan, device=xpux:0) Tensor(nan, device=xpux:0)
9 Tensor(nan, device=xpux:0) Tensor(nan, device=xpu
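For anyone reproducing this, the min/max checks in the log can be extended to stop at the first array that contains NaN or Inf, which helps narrow down whether the inputs, the predictions, or the loss go bad first. The following is only a minimal sketch in NumPy, not code from CREStereo: the helper names `summarize`, `first_non_finite`, and `masked_l1` are hypothetical, and `masked_l1` is a generic masked L1 loss used for illustration, not the loss from the repository. It guards against an empty valid mask (a 0/0 division), which is one common source of NaN losses in general but is not confirmed as the cause here; in the log above the mask maximum is 1.0, so the mask was not empty.

```python
import numpy as np


def summarize(name, arr):
    """Return (summary_line, is_all_finite) for one array."""
    arr = np.asarray(arr, dtype=np.float64)
    finite = np.isfinite(arr)
    if finite.any():
        lo, hi = arr[finite].min(), arr[finite].max()
    else:
        lo = hi = float("nan")
    n_bad = int((~finite).sum())
    line = f"{name}: min={lo:.4g} max={hi:.4g} non_finite={n_bad}"
    return line, n_bad == 0


def first_non_finite(named_arrays):
    """Print a summary per array; return the first name with NaN/Inf, else None."""
    for name, arr in named_arrays:
        line, ok = summarize(name, arr)
        print(line)
        if not ok:
            return name
    return None


def masked_l1(pred, gt, valid_mask):
    """Mean absolute error over valid pixels (hypothetical loss, for illustration).

    Returns 0.0 when no pixel is valid instead of dividing by zero.
    """
    pred = np.asarray(pred, dtype=np.float64)
    gt = np.asarray(gt, dtype=np.float64)
    mask = np.asarray(valid_mask) > 0.5
    if int(mask.sum()) == 0:
        return 0.0
    return float(np.abs(pred[mask] - gt[mask]).mean())
```

Calling `first_non_finite([("left", left), ("right", right), ("gt_disp", gt_disp), ("pred", pred)])` once per batch, after converting each tensor to a NumPy array, would show which quantity turns non-finite first. If the inputs are still finite in the second batch, as they are in the log above, the NaN would have to come from the forward pass or from weights updated after the first step.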

If-only1 avatar Apr 15 '22 05:04 If-only1

Hello! I have the same problem. Have you solved it?

jim88481 avatar Jun 01 '22 06:06 jim88481

@jim88481 I solved it by switching to the PyTorch implementation: https://github.com/ibaiGorordo/CREStereo-Pytorch

If-only1 avatar Jun 04 '22 10:06 If-only1

> @jim88481 I solved it by switching to the PyTorch implementation: https://github.com/ibaiGorordo/CREStereo-Pytorch

OK, thank you so much!

jim88481 avatar Jun 05 '22 06:06 jim88481

> @jim88481 I solved it by switching to the PyTorch implementation: https://github.com/ibaiGorordo/CREStereo-Pytorch

Hi, may I ask whether the PyTorch implementation achieves the same performance as the MegEngine implementation?

WenjiaR avatar Jul 22 '22 13:07 WenjiaR