CSRNet-pytorch
Why don't you use the full pretrained VGG16 architecture?
I have read your paper and don't understand why you use only the first ten layers of VGG-16, with three pooling layers, instead of the full pre-trained VGG16 architecture. Thanks.
I think the reason is that crowd counting does not need deep features that carry semantic information. Such semantic information might hurt performance, since we mainly need shallower features such as edges.
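To make the truncation being discussed concrete, here is a rough sketch in plain Python that compares the first ten convolutional layers of VGG-16 with the full convolutional stack. The config lists follow the usual VGG-16 layout (integers are 3x3 conv output channels, "M" is a 2x2 max-pool with stride 2). The names `VGG16_FULL`, `VGG16_FIRST_TEN`, `summarize`, and `output_size` are made up for illustration and are not taken from this repository. It only counts layers and downsampling; it does not build or load a model.

```python
# Hypothetical config lists for illustration; integers are conv output
# channels, "M" marks a 2x2 max-pool with stride 2.
VGG16_FULL = [64, 64, "M", 128, 128, "M", 256, 256, 256, "M",
              512, 512, 512, "M", 512, 512, 512, "M"]
VGG16_FIRST_TEN = [64, 64, "M", 128, 128, "M", 256, 256, 256, "M",
                   512, 512, 512]


def summarize(cfg):
    """Count conv and pooling layers and the resulting downsampling factor."""
    convs = sum(1 for layer in cfg if layer != "M")
    pools = sum(1 for layer in cfg if layer == "M")
    return {"convs": convs, "pools": pools, "stride": 2 ** pools}


def output_size(cfg, height, width):
    """Spatial size of the feature map, assuming size-preserving convs
    (3x3, padding 1) so that only the pooling layers shrink the input."""
    for layer in cfg:
        if layer == "M":
            height, width = height // 2, width // 2
    return height, width


if __name__ == "__main__":
    for name, cfg in [("first ten", VGG16_FIRST_TEN), ("full", VGG16_FULL)]:
        print(name, summarize(cfg), output_size(cfg, 768, 1024))
```

Under these assumptions, the truncated stack has 10 conv layers and 3 pooling layers, so a 768x1024 input gives a 96x128 feature map (1/8 resolution), while the full stack has 13 conv layers and 5 pooling layers and gives 24x32 (1/32 resolution). This only illustrates the resolution difference; it does not by itself settle whether shallow features or spatial resolution is the main reason for the design.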
@doubbblek Is there a paper that discusses this? Thanks for your answer.