P3Depth

Can you give me some advice on training the model?

Open BayMaxBHL opened this issue 2 years ago • 13 comments

After setting up the environment, I trained on the NYU dataset, but the results were very strange: the loss converged slowly, rmse kept increasing, and delta kept decreasing.
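For readers unfamiliar with the two metrics named here, the following is a minimal sketch of how RMSE and threshold accuracy (delta) are commonly computed in depth estimation. It assumes delta means the fraction of pixels with max(pred/gt, gt/pred) below 1.25, which is the usual convention but is not confirmed for this repository. The function name `depth_metrics` and its arguments are hypothetical.

```python
import math


def depth_metrics(pred, target, threshold=1.25):
    """Return (rmse, delta) for two flat sequences of depth values.

    Assumption: pixels with non-positive depth are treated as invalid
    and skipped. This is a common convention, not taken from the repo.
    """
    pairs = [(p, t) for p, t in zip(pred, target) if p > 0 and t > 0]
    if not pairs:
        raise ValueError("no valid depth pairs")

    # Root mean squared error over valid pixels (lower is better).
    rmse = math.sqrt(sum((p - t) ** 2 for p, t in pairs) / len(pairs))

    # Fraction of pixels whose ratio error is below the threshold
    # (higher is better).
    delta = sum(1 for p, t in pairs if max(p / t, t / p) < threshold) / len(pairs)
    return rmse, delta


rmse, delta = depth_metrics([1.0, 2.0, 4.0], [1.0, 2.0, 2.0])
print(rmse, delta)
```

Under this definition, a rising rmse together with a falling delta means the predictions are moving away from the ground truth on both measures, which fits the description of a run that is not converging.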

BayMaxBHL avatar Nov 10 '22 01:11 BayMaxBHL

image

BayMaxBHL avatar Nov 10 '22 01:11 BayMaxBHL

image

BayMaxBHL avatar Nov 10 '22 01:11 BayMaxBHL

image

BayMaxBHL avatar Nov 10 '22 01:11 BayMaxBHL

image

BayMaxBHL avatar Nov 10 '22 01:11 BayMaxBHL

image

BayMaxBHL avatar Nov 10 '22 01:11 BayMaxBHL

The only changes I made to the code were writing a script to generate the CSV and setting the batch size to 8.

BayMaxBHL avatar Nov 10 '22 01:11 BayMaxBHL

"After setting up the environment, I used the NYU dataset for training, but the training results were very strange. The loss function converged slowly, rmse kept increasing, and delta kept decreasing." Can you share your code? I can't run it with the current

haifengwu205 avatar Feb 09 '23 13:02 haifengwu205

@BayMaxBHL Hello, can you run the test code? When I run it, I get a size mismatch error: size mismatch for net.coords: copying a param with shape torch.Size([8, 480, 640, 2]) from checkpoint, the shape in current model is torch.Size([16, 480, 640, 2]).
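The two shapes in this error differ only in the first dimension (8 versus 16). That could mean `net.coords` is sized by the batch size, but this is a guess and is not confirmed by the repository. Below is a minimal sketch for listing such mismatches before loading a checkpoint. The helper `find_shape_mismatches` is hypothetical, and it takes plain name-to-shape dictionaries so it runs without PyTorch. With PyTorch, the dictionaries could be built as `{k: tuple(v.shape) for k, v in state_dict.items()}`.

```python
def find_shape_mismatches(checkpoint_shapes, model_shapes):
    """Return {name: (checkpoint_shape, model_shape)} for entries that
    exist in both mappings but have different shapes."""
    mismatches = {}
    for name, ckpt_shape in checkpoint_shapes.items():
        model_shape = model_shapes.get(name)
        if model_shape is not None and tuple(model_shape) != tuple(ckpt_shape):
            mismatches[name] = (tuple(ckpt_shape), tuple(model_shape))
    return mismatches


# Shapes copied from the error message above.
checkpoint_shapes = {"net.coords": (8, 480, 640, 2)}
model_shapes = {"net.coords": (16, 480, 640, 2)}

for name, (ckpt, model) in find_shape_mismatches(checkpoint_shapes, model_shapes).items():
    print(f"{name}: checkpoint {ckpt} vs model {model}")
```

If the first dimension does turn out to be the batch size, running the test code with the same batch size that was used for training may avoid the error. That is an assumption to verify against the code, not a confirmed fix.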

haifengwu205 avatar Feb 14 '23 03:02 haifengwu205

@haifengwu205 It really doesn't work. The results I posted show exactly that: it doesn't converge. The rmse even keeps climbing, and I'm at a loss.

BayMaxBHL avatar Feb 16 '23 02:02 BayMaxBHL

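"@haifengwu205 It really doesn't work. The results I posted show exactly that: it doesn't converge. The rmse even keeps climbing, and I'm at a loss."

May I ask how large the NYU dataset used by this code is?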

hutingz avatar Apr 13 '23 11:04 hutingz

Hi. I'm also trying to train the model myself, and it does not converge either. Has anyone found a solution so far? Thanks in advance!

macromogic avatar May 14 '23 06:05 macromogic

I'm in a similar situation. I used the NYU dataset, and the results collapsed badly. It seems the model learned something completely wrong.

image

zhaorui-tan avatar Oct 18 '23 12:10 zhaorui-tan

Why don't I get any output after running the train.sh script?

jianqiaowang-wjq avatar Oct 20 '23 03:10 jianqiaowang-wjq