3DCrowdNet_RELEASE
Cannot reproduce without pre-trained ResNet-50 weights of xiao2018simple
Hi,
I tried to reproduce Table 8 without the pre-trained ResNet-50 weights of xiao2018simple.
My training command is python train.py --amp --gpu 0 --cfg ../assets/yaml/3dpw_crowd.yml
and the config file is:
trainset_3d: ['Human36M', 'MuCo']
trainset_2d: ['MSCOCO', 'MPII']
testset: 'PW3D'
lr_dec_epoch: [30]
end_epoch: 40
lr: 0.00025 #0.001/4
lr_backbone: 0.0001
lr_dec_factor: 10
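(For reference, my understanding of lr_dec_epoch / lr_dec_factor is a simple step decay: the learning rate is divided by lr_dec_factor once training passes each epoch listed in lr_dec_epoch. A minimal sketch of that schedule, assuming standard step decay rather than the repo's exact code:)

```python
def lr_at_epoch(epoch, base_lr=0.00025, dec_epochs=(30,), dec_factor=10):
    """Step decay: divide the LR by dec_factor after each epoch in dec_epochs."""
    lr = base_lr
    for e in dec_epochs:
        if epoch >= e:
            lr /= dec_factor
    return lr

# With lr_dec_epoch: [30] and end_epoch: 40, epochs 0-29 use 2.5e-4
# and epochs 30-39 use 2.5e-5.
print(lr_at_epoch(10), lr_at_epoch(35))  # 0.00025 2.5e-05
```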
However, I got very strange results on 3DPW, as shown below (I evaluate every epoch):
Do you have any idea about this? Thank you!
Hi,
If you are not using the pretrained backbone, please set 'lr' and 'lr_backbone' to the same value.
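To clarify what those two settings control: the optimizer typically builds separate parameter groups for the backbone and the rest of the network. A minimal sketch of that idea (the module names here are illustrative, not the exact 3DCrowdNet code):

```python
import torch
import torch.nn as nn

# Illustrative stand-ins for the real network: a "backbone" and a "head".
backbone = nn.Linear(10, 10)
head = nn.Linear(10, 3)

lr, lr_backbone = 5e-4, 5e-4  # same value when the backbone is NOT pretrained

optimizer = torch.optim.Adam([
    {'params': backbone.parameters(), 'lr': lr_backbone},  # backbone group
    {'params': head.parameters(), 'lr': lr},               # rest of the network
])
# With a pretrained backbone you would set lr_backbone < lr so the pretrained
# features are only fine-tuned; from scratch, both groups need the full lr.
```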
I changed my config file as below:
trainset_3d: ['Human36M', 'MuCo']
trainset_2d: ['MSCOCO', 'MPII']
testset: 'PW3D'
lr_dec_epoch: [30]
end_epoch: 40
lr: 0.0005
lr_backbone: 0.0005
lr_dec_factor: 10
# modify batch size
train_batch_size: 128
test_batch_size: 128
However, the results were still weird.
Hi,
Yes, the results seem weird.
- Are you evaluating on 3DPW-Crowd?
- How can you train that fast? I don't remember exactly, but it took more than 12 hours to train for 6 epochs. You are training for 40 epochs with half the batch size; 2 days are not enough.
- I evaluate on 3DPW, not 3DPW-Crowd.
- I used an RTX 3090 with batch size 128. The training time is 0.91 h/epoch.
Wow, I didn't know the RTX 3090 is that much better than the RTX 2080 Ti.
I thought you were testing on 3DPW-Crowd, since you are using 3dpw_crowd.yml
My training command is python train.py --amp --gpu 0 --cfg ../assets/yaml/3dpw_crowd.yml
Can you share your full code via a GitHub repo? Some information is confusing, and the increasing errors seem really weird.
So sorry that I pasted the wrong command.
My training command is python train.py --amp --gpu 0 --cfg ../assets/yaml/3dpw.yml
This is my full code: https://github.com/mimiliaogo/3DCrowdNet-Mimi
Thank you so much!
Thanks for sharing the code. I can't find a critical bug...
Here are a few suggestions.
- Could you try testing with test.py? Because of the per-epoch evaluation, the testing data could be unintentionally overwritten during the process.
- Could you visualize the training data? Visualize the GT joints and meshes on the image; there could be corruption during downloading. And is there any change in the MPII.py code? (A minimal visualization sketch is included after the config below.)
- Could you train with the config below and see the result? It shouldn't take long, and it will show which dataset is causing the increasing error.
trainset_3d: []
trainset_2d: ['MSCOCO']
testset: 'PW3D'
lr_dec_epoch: [30]
end_epoch: 40
lr: 0.001
lr_backbone: 0.001
lr_dec_factor: 10
# modify batch size
train_batch_size: 128
test_batch_size: 128
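For the visualization, something as simple as the sketch below is enough. It assumes 2D GT joints in pixel coordinates and an OpenCV-readable image path; the array layout and skeleton pairs are illustrative, not the repo's exact annotation format.

```python
import cv2
import numpy as np

def draw_gt_joints(img_path, joints_2d, skeleton=(), out_path='vis.jpg'):
    """Draw GT 2D joints (and optional bones) on an image and save it.

    joints_2d: (J, 2) or (J, 3) array in pixel coordinates; an optional
    third column is treated as a validity/visibility flag.
    skeleton: iterable of (parent, child) joint-index pairs.
    """
    img = cv2.imread(img_path)
    joints_2d = np.asarray(joints_2d, dtype=np.float32)
    valid = joints_2d[:, 2] > 0 if joints_2d.shape[1] > 2 else np.ones(len(joints_2d), bool)

    for p, c in skeleton:  # bones in green
        if valid[p] and valid[c]:
            cv2.line(img, (int(joints_2d[p, 0]), int(joints_2d[p, 1])),
                     (int(joints_2d[c, 0]), int(joints_2d[c, 1])), (0, 255, 0), 2)
    for j in range(len(joints_2d)):  # joints in red
        if valid[j]:
            cv2.circle(img, (int(joints_2d[j, 0]), int(joints_2d[j, 1])), 3, (0, 0, 255), -1)
    cv2.imwrite(out_path, img)

# Dump a handful of random samples from each training set and eyeball them;
# if keypoints land on the wrong person or off the body, the annotations or
# the downloaded images are likely corrupted.
```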
Hi,
I tried the config from suggestion 3, and the results seem normal.
So maybe the problem comes from the training data; I will try to visualize it.
BTW, there is no change in the MPII.py code.
@hongsukchoi, when I train your model with Human3.6M and MuCo respectively, both lead to increasing errors.
I visualized the GT keypoints and joints, and the results seem normal (maybe a little inaccurate, but mostly right).
However, I still don't know why these two datasets lead to increasing errors...