
[CLIFF] Question about training CLIFF

Open MooreManor opened this issue 1 year ago • 10 comments

@zhihaolee I tried training CLIFF (ResNet-50) from scratch.

Here is my experiment setting. I used 4 GPUs with SyncBatchNorm, with a per-GPU batch size of 64. I trained CLIFF on Human3.6M, COCO (your pseudo-GT), MPII (your pseudo-GT), MPI-INF-3DHP, and the 3DPW train set with sampling partitions of 0.4, 0.3, 0.3, 0.1, and 0.2, respectively. I didn't use PARE's synthetic-occlusion augmentation. I used a learning rate of 1e-4 and didn't reduce it mid-training. I initialized with ResNet-50 weights pretrained on COCO instead of ImageNet. The input image size is 256x192. A sketch of this setup is shown below.
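For reference, here is a minimal sketch of how I wired up the datasets and DDP. This is not the actual CLIFF code; `make_dataset`, `model`, and `local_rank` are placeholders from my own script.

```python
# Minimal sketch of my multi-dataset + 4-GPU setup (placeholders, not CLIFF's code).
import torch
from torch.utils.data import ConcatDataset, DataLoader, WeightedRandomSampler

# Sampling partitions from my config (they need not sum to 1;
# WeightedRandomSampler only uses relative weights).
partitions = {'h36m': 0.4, 'coco': 0.3, 'mpii': 0.3,
              'mpi_inf_3dhp': 0.1, '3dpw': 0.2}
datasets = {name: make_dataset(name) for name in partitions}  # placeholder factory

# Give every sample a weight so each dataset is drawn with its partition probability.
weights = torch.cat([torch.full((len(ds),), partitions[name] / len(ds))
                     for name, ds in datasets.items()])
concat = ConcatDataset(list(datasets.values()))
sampler = WeightedRandomSampler(weights, num_samples=len(concat), replacement=True)
loader = DataLoader(concat, batch_size=64, sampler=sampler, num_workers=8)

# 4-GPU DDP with SyncBatchNorm, as in my runs.
model = torch.nn.SyncBatchNorm.convert_sync_batchnorm(model)
model = torch.nn.parallel.DistributedDataParallel(model, device_ids=[local_rank])
```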

The paper mentions that the learning rate is set to 1e-4 and reduced by a factor of 10 at the midpoint of training; a sketch of that schedule is below.
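This is the schedule as I read it, assuming Adam and a 200-epoch run (the authors' exact optimizer settings may differ; `train_one_epoch` is a placeholder for my training loop):

```python
# 1e-4, dropped 10x at the midpoint (epoch 100 of 200).
import torch

optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)  # model as above
scheduler = torch.optim.lr_scheduler.MultiStepLR(optimizer, milestones=[100], gamma=0.1)

for epoch in range(200):
    train_one_epoch(model, loader, optimizer)  # placeholder training loop
    scheduler.step()
```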

The lr reduction is scheduled for the midpoint of training (i.e., the 100th epoch of 200). However, at around the 12th epoch, my model already reached 74.9 MPJPE on 3DPW, which is close to the 72.0 MPJPE reported in your paper. Evaluating your released checkpoint on 3DPW on my machine gives about 73.1 MPJPE, so only 12 of the planned 200 epochs of training already produce a similar result. [screenshot: evaluation results]
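For reference, this is how I compute MPJPE (a standard root-relative sketch in millimeters; your eval script may differ, e.g., in the joint regressor or alignment):

```python
import numpy as np

def mpjpe(pred, gt, root=0):
    """pred, gt: (N, J, 3) joint positions in meters."""
    pred = pred - pred[:, root:root + 1]  # root-align predictions
    gt = gt - gt[:, root:root + 1]        # root-align ground truth
    return float(np.linalg.norm(pred - gt, axis=-1).mean() * 1000.0)
```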

  • Given this evaluation performance, should I still wait until the 100th epoch to reduce the learning rate, or should I reduce it earlier, e.g., around the 10th epoch?
  • Is the convergence speed of my experiment similar to yours?

Here is the training log. Did I do something wrong? [screenshot: training log]

MooreManor · Nov 21 '22