Warmup iterations
Hello,
I have a general question about one of the hyperparameters used in the configs for the HRNet and ResNet models:
- What exactly do the warmup iterations (`warmup_iters`) do? Is it a hyperparameter?
- Are batches not used in the HRNet and ResNet models? The batch size is not mentioned in the config files, or is it mentioned under some other name?
- For human pose estimation, what loss function is used while training the model?
- Warm-up is a commonly used training strategy that gradually increases the learning rate at the beginning of the training, instead of directly starting with a large learning rate.
- The batch size is controlled by `samples_per_gpu` and the number of GPUs used in training. For example, if `samples_per_gpu` is 64 and 8 GPUs are used, the batch size is 64 x 8 = 512.
- It depends on the algorithm. Please check the `loss_keypoint` field in the config. (A config sketch follows after this list.)
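For illustration, here is a minimal sketch of where these two fields typically live in a top-down mmpose (0.x) config; the head type and values below are placeholders, not taken from any specific released config:

```python
data = dict(
    samples_per_gpu=64,  # per-GPU batch size; effective batch size = 64 x num_gpus
    workers_per_gpu=2,
)

model = dict(
    type='TopDown',
    keypoint_head=dict(
        type='TopdownHeatmapSimpleHead',  # placeholder head type
        # heatmap-based methods commonly use a mean-squared-error loss
        loss_keypoint=dict(type='JointsMSELoss', use_target_weight=True),
    ),
)
```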
- Regarding "Warm-up is a commonly used training strategy that gradually increases the learning rate at the beginning of the training": do you mean that if the lr is given as 5e-4, epoch 1 will start from 5e-4? Are warmup iterations applicable even if no LR policy is used?
- In the config, the validation and test datasets are the same. Is there a specific reason for this? What problem can occur if the training dataset is also used as the validation and test dataset?
- Why does the model converge so quickly?
- If `lr=5e-4, warmup_ratio=0.001, warmup_iters=500`, it means the learning rate will increase from `5e-4 * 0.001` to `5e-4` during the first 500 iterations. Please check the code in mmcv for details: https://github.com/open-mmlab/mmcv/blob/master/mmcv/runner/hooks/lr_updater.py#L11 And there must be a learning policy for training, I think. You can use a constant scheduler if you don't want to change the learning rate. (A config sketch follows after this list.)
- It's a common practice to use different data splits for training and validation so you can tell whether the model is overfitting.
- How quickly did your model converge, and why do you think it is a problem?
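Putting those numbers together, a scheduler section might look like the sketch below (mmcv-style config; the `step` milestones are placeholders, not values from this thread):

```python
optimizer = dict(type='Adam', lr=5e-4)

lr_config = dict(
    policy='step',       # main LR policy; use 'fixed' for a constant learning rate
    warmup='linear',     # warm-up is applied only because this key is set
    warmup_iters=500,    # ramp up over the first 500 iterations
    warmup_ratio=0.001,  # start at lr * 0.001 = 5e-7 and reach lr = 5e-4
    step=[170, 200],     # placeholder decay milestones for the 'step' policy
)
```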
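For reference, the linear warm-up in the linked mmcv code follows roughly the rule below; this is a simplified re-implementation for illustration, not mmcv's actual hook:

```python
base_lr, warmup_ratio, warmup_iters = 5e-4, 0.001, 500

def linear_warmup_lr(cur_iter):
    """Learning rate at iteration cur_iter, mirroring mmcv's linear warm-up rule."""
    if cur_iter >= warmup_iters:
        return base_lr  # warm-up finished; the regular policy takes over
    k = (1 - cur_iter / warmup_iters) * (1 - warmup_ratio)
    return base_lr * (1 - k)

print(linear_warmup_lr(0))    # 5e-07 (= lr * warmup_ratio)
print(linear_warmup_lr(250))  # ~2.5e-04
print(linear_warmup_lr(500))  # 5e-04 (= lr)
```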
- What happens if no `warmup_ratio` is given in the config? I have not set the warmup iterations in my config.
- How can we check this by looking at the loss? If the lr is high, it makes the model learn faster and converge faster, but it may miss the global minimum.
- So if the training and validation datasets are the same, the model cannot be checked for overfitting?
Also, how many layers do the HRNet and ResNet models have?
- The warm-up will be applied if the argument `warmup` is set. It will use a default warm-up ratio if the argument `warmup_ratio` is missing. Please check the code in the given link above for details.
- You can plot the loss curve (you can uncomment this line to use the tensorboard visualizer hook; a sketch follows after this list) or check the validation performance.
- Yes.
- We provide standard settings for HRNet (e.g. HRNet-w32 and HRNet-w48) and ResNet (e.g. ResNet-50, ResNet-101). Please check the paper for details of the model architecture.
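For the loss-curve point above: enabling TensorBoard in an mmcv-style config usually means uncommenting the logger hook, roughly as below (the `interval` value is illustrative):

```python
log_config = dict(
    interval=50,  # log every 50 iterations (illustrative value)
    hooks=[
        dict(type='TextLoggerHook'),
        # uncomment this hook to view the loss curve in TensorBoard
        dict(type='TensorboardLoggerHook'),
    ],
)
```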
I have generated some output images. How can we change the colour and thickness of the lines used to draw the skeleton?