Use PVTv2 as backbone for custom model
Hello,
I want to use PVTv2 as the backbone for my human pose estimation model, which is not based on mmcv. How should I use it correctly?
Thanks in advance!
Hi,
I have the same question as the comment above. I want to use PVTv2 as the backbone for a tracking model. I tried to concatenate the tensors from the four stages and zero-pad them into a single tensor, but I don't think that is the right way. How should I use your model correctly?
Thank you.
@hoangtv2000 Hello, did you divide PVTv2 into 4 separate stages or use an iterative method? I am a bit confused.
I just concatenate the 4 tensors along shape[1] (the channel dimension), after zero-padding the tensors with smaller spatial sizes (shape[2] and shape[3]) so they all match.
But I don't think that is the right way: the concatenated feature map is four times larger than the one from my modified ResNet. Specifically, the concatenated feature map is [1, 1024, 64, 64], while the ResNet feature map is [1, 1024, 32, 32].
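A rough sketch of what I mean (the shapes are just examples for a 256x256 input with the pvt_v2_b2 channel widths):

```python
import torch
import torch.nn.functional as F

# Illustrative stage outputs with strides 4/8/16/32 and pvt_v2_b2 channels.
feats = [
    torch.randn(1, 64, 64, 64),
    torch.randn(1, 128, 32, 32),
    torch.randn(1, 320, 16, 16),
    torch.randn(1, 512, 8, 8),
]

# Zero-pad every map up to the largest spatial size (shape[2], shape[3]).
h, w = feats[0].shape[2], feats[0].shape[3]
padded = [F.pad(f, (0, w - f.shape[3], 0, h - f.shape[2])) for f in feats]

# Concatenate along shape[1] (channels): 64 + 128 + 320 + 512 = 1024.
fused = torch.cat(padded, dim=1)
print(fused.shape)  # torch.Size([1, 1024, 64, 64])
```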
Right. I am splitting the four stages apart for better understanding. Could you please help me separate the four stages? The implementation below is only for PVT, but I need it for PVTv2: https://github.com/ofsoundof/LocalViT/blob/main/models/pvt.py
Hi,
The output of https://github.com/whai362/PVT/blob/16eabba29aca820e785a8def1ec73bb805c2daec/detection/pvt_v2.py#L308 is a list containing the feature map output by each stage. In other words, there are 4 feature maps in total, as described in the PVTv1 paper (see Figure 3).
PS: PVTv2 applies 3 improvements on top of PVTv1; the overall network structure is almost the same.
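If it helps, here is a minimal sketch of how those stage outputs could be consumed outside of mmcv/mmdet. It assumes you have copied detection/pvt_v2.py into your project as pvt_v2.py, with the mmdet registry decorators and mmcv checkpoint imports removed so it imports without mmcv; the shapes in the comments are what I would expect for pvt_v2_b2:

```python
import torch
# Assumed local copy of detection/pvt_v2.py with the mmdet/mmcv-specific
# decorators and imports stripped, so it can be used without mmcv.
from pvt_v2 import pvt_v2_b2

backbone = pvt_v2_b2()           # pretrained weights can be loaded separately
backbone.eval()

x = torch.randn(1, 3, 512, 512)  # dummy input image
with torch.no_grad():
    feats = backbone(x)          # list of 4 feature maps, one per stage

for i, f in enumerate(feats):
    print(f"stage {i + 1}: {tuple(f.shape)}")
# Expected (strides 4/8/16/32, channels [64, 128, 320, 512] for b2):
# stage 1: (1, 64, 128, 128)
# stage 2: (1, 128, 64, 64)
# stage 3: (1, 320, 32, 32)
# stage 4: (1, 512, 16, 16)
```

From there you can pick the stages your pose or tracking head needs (or feed all four into an FPN-style neck) rather than zero-padding and concatenating them into a single tensor.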