CrossStagePartialNetworks

Training Steps Mismatch in the paper and the code in ImageNet Experiments

Open Chaimmoon opened this issue 4 years ago • 38 comments

Hi,

In ImageNet Experiments, the paper said that it should be trained for 800 epochs:

(screenshot of the paper)

However, in the code, it said that it should be trained for 80 epochs:

(screenshot of the code)

So there is a big difference...

Besides, I tried to re-implement it in PyTorch, and the accuracy is 7~8 points behind your method. The network architecture and the number of parameters are the same as in your Darknet results...

Best, Mu

Chaimmoon avatar May 03 '20 14:05 Chaimmoon

@Chaimmoon

Thank you for pointing out the typo. It should be 800,000, which is the same as in the cfg.

I have only implemented CSPDenseNet and CSPDarknet with PyTorch. Following are the results of (CSP)DenseNet-{121, 169, 201, 264} with PyTorch (see attached image). My PyTorch implementations of darknet53 and cspdarknet53 get 76.3/92.9 and 76.9/93.3 top-1/top-5 accuracy with 224x224 input resolution, respectively.

You should make sure the BN layers and activation functions are the same as in the provided cfg file.

WongKinYiu avatar May 03 '20 14:05 WongKinYiu

@Chaimmoon

This is my PyTorch implementation of CSPDarknet: darknet.py.txt

I borrow some functions from mmdetection and mmcv. The main difference between CSPDarknet and CSPResNe(X)t is that CSPDarknet uses darknet_layer and CSPResNe(X)t uses resne(x)t_layer.

            x = down_layer(x)                # down-sample at the start of the stage
            x1, x2 = x.chunk(2, dim=1)       # cross-stage split into two halves
            x2 = darknet_layer(x2)           # only one half passes through the blocks
            x = torch.cat([x1, x2], 1)       # merge the two paths
            x = tran_layer(x)                # 1x1 transition to fuse the features
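For anyone who wants a runnable reference, here is a minimal self-contained sketch of one CSP stage built around exactly this split / darknet-block / transition pattern. It is an illustrative rewrite, not the code from darknet.py.txt: ConvBnLeaky, DarknetBlock and CSPStage are placeholder names, and the real cfg should be checked for the exact activation and channel widths.

import torch
import torch.nn as nn

class ConvBnLeaky(nn.Sequential):
    # conv + BN + LeakyReLU (as discussed above; check the cfg for the exact activation)
    def __init__(self, c_in, c_out, k=1, s=1):
        super().__init__(
            nn.Conv2d(c_in, c_out, k, s, k // 2, bias=False),
            nn.BatchNorm2d(c_out),
            nn.LeakyReLU(0.1, inplace=True))

class DarknetBlock(nn.Module):
    # darknet-style residual block: 1x1 conv -> 3x3 conv + identity shortcut
    def __init__(self, channels):
        super().__init__()
        self.conv1 = ConvBnLeaky(channels, channels, 1)
        self.conv2 = ConvBnLeaky(channels, channels, 3)

    def forward(self, x):
        return x + self.conv2(self.conv1(x))

class CSPStage(nn.Module):
    # one cross-stage-partial stage: down-sample, split channels in half,
    # run the darknet blocks on one half only, concatenate, then fuse with a 1x1 conv
    def __init__(self, c_in, c_out, num_blocks):
        super().__init__()
        self.down_layer = ConvBnLeaky(c_in, c_out, 3, s=2)
        self.darknet_layer = nn.Sequential(*[DarknetBlock(c_out // 2) for _ in range(num_blocks)])
        self.tran_layer = ConvBnLeaky(c_out, c_out, 1)

    def forward(self, x):
        x = self.down_layer(x)
        x1, x2 = x.chunk(2, dim=1)      # cross-stage split
        x2 = self.darknet_layer(x2)     # partial path through the blocks
        x = torch.cat([x1, x2], dim=1)  # merge both paths
        return self.tran_layer(x)       # transition layer

if __name__ == "__main__":
    stage = CSPStage(64, 128, num_blocks=2)
    print(stage(torch.randn(1, 64, 56, 56)).shape)  # -> torch.Size([1, 128, 28, 28])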

WongKinYiu avatar May 03 '20 15:05 WongKinYiu

@WongKinYiu

Thanks for your reply!

I implemented ResNet10, ResNet50 and ResNeXt50. The results are not quite as good as your paper reports... (Besides, can you provide the cfg file for ResNet10_CSP? The architectures of ResNet10 and ResNet50 are quite different.)

As for the BN, it should be torch.nn.BatchNorm2d, and the activation function should be torch.nn.LeakyReLU, right?

Can you provide your PyTorch code? Thanks!

Best, Mu

Chaimmoon avatar May 03 '20 15:05 Chaimmoon

@Chaimmoon

My PyTorch code is posted on https://github.com/WongKinYiu/CrossStagePartialNetworks/issues/24#issuecomment-623125410.

I am sorry that I cannot release my lightweight models due to some issues. You can try to follow the rule of ResNet50 -> CSPResNet50 to modify ResNet10 -> CSPResNet10.

WongKinYiu avatar May 03 '20 15:05 WongKinYiu

@WongKinYiu Thanks for your work! I have a question about [sam] layers.

In https://github.com/AlexeyAB/darknet/issues/3708#issuecomment-518618199, the SAM module consists of one [convolutional] layer and one [sam] layer, like the following: (image)

While in https://github.com/AlexeyAB/darknet/issues/5355#issuecomment-619859913, the SAM module consists of two [convolutional] layers and one [sam] layer, not one [convolutional] layer, like the following:

[convolutional]
batch_normalize=1
size=3
stride=1
pad=1
filters=512
activation=mish

[convolutional]
batch_normalize=1
filters=512
size=1
stride=1
pad=1
activation=logistic

[sam]
from=-2

What's more, in https://github.com/AlexeyAB/darknet/issues/5355#issuecomment-619859913 the [convolutional] layer in front of the [sam] layer has pad=1, while in https://github.com/AlexeyAB/darknet/issues/3708#issuecomment-518618199 the [convolutional] layer in front of the [sam] layer does not have pad=1.

I want to know which [sam] layer is correct.

nyj-ocean avatar May 05 '20 10:05 nyj-ocean

@nyj-ocean Hello,

  1. In https://github.com/AlexeyAB/darknet/issues/5355#issuecomment-619859913
[convolutional]
batch_normalize=1
size=3
stride=1
pad=1
filters=512
activation=mish

[convolutional]
batch_normalize=1
filters=512
size=1
stride=1
pad=1
activation=logistic

[sam]
from=-2

which is the SAM module (image). A PyTorch sketch of this block is given after the list.

  2. In https://github.com/AlexeyAB/darknet/issues/3708#issuecomment-518618199, which shows the usage of the [sam] layer (image).

  3. pad=1 and pad=0 are the same when the convolutional filter size is 1x1 (with a 1x1 filter, pad=1 gives padding = size/2 = 0, so both result in zero padding).
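For reference, here is a minimal PyTorch sketch of the two-conv + [sam] pattern from the cfg in item 1. It is an illustrative rewrite, not the Darknet source: it assumes a recent PyTorch that provides nn.Mish, and filters=512 from the cfg becomes the channels argument.

import torch
import torch.nn as nn

class ModifiedSAM(nn.Module):
    # [convolutional] 3x3, BN, mish     -> feature map (the layer referenced by from=-2)
    # [convolutional] 1x1, BN, logistic -> per-element attention weights
    # [sam] from=-2                     -> element-wise product of the two outputs
    def __init__(self, channels=512):
        super().__init__()
        self.conv3x3 = nn.Sequential(
            nn.Conv2d(channels, channels, 3, 1, 1, bias=False),
            nn.BatchNorm2d(channels),
            nn.Mish(inplace=True))
        self.conv1x1 = nn.Sequential(
            nn.Conv2d(channels, channels, 1, 1, 0, bias=False),
            nn.BatchNorm2d(channels),
            nn.Sigmoid())  # activation=logistic

    def forward(self, x):
        feat = self.conv3x3(x)     # output of the layer two steps back (from=-2)
        attn = self.conv1x1(feat)  # point-wise attention map, same shape as feat
        return feat * attn         # the [sam] layer: element-wise multiplication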

WongKinYiu avatar May 05 '20 11:05 WongKinYiu

@WongKinYiu Thanks for your reply. I want to add the SAM module to YOLOv3. Can you help me check whether the following cfg is right?

SAM-to-yolov3.cfg.txt

nyj-ocean avatar May 05 '20 14:05 nyj-ocean

@nyj-ocean

The last [sam] block seems to be at a different layer compared with the 1st and 2nd [sam] blocks in your cfg file.

And in my previous experiments, I used the sam layer as in: SAM-to-yolov3.cfg.txt

WongKinYiu avatar May 05 '20 14:05 WongKinYiu

@WongKinYiu Thanks for your help! I noticed that the YOLOv4 paper mentions a modified SAM block. Is the SAM block in your provided SAM-to-yolov3.cfg.txt (https://github.com/WongKinYiu/CrossStagePartialNetworks/issues/24#issuecomment-624093575) equal to the modified SAM block mentioned in YOLOv4?

nyj-ocean avatar May 05 '20 15:05 nyj-ocean

Yes, it is the same. And the comparison with/without SAM is posted in the 1st table of the README in this repo.

WongKinYiu avatar May 05 '20 15:05 WongKinYiu

@WongKinYiu thanks for your help!!!

nyj-ocean avatar May 06 '20 03:05 nyj-ocean

@WongKinYiu

Hi, I have checked the network structure and number of parameters in my CSPResNet/CSPResNeXt PyTorch implementation, which are the same as what you reported in your GitHub README file, including nn.BatchNorm2d, nn.LeakyReLU, training epochs, batch size and learning rate schedule. I also had a close look at your Darknet PyTorch implementation. However, the accuracy is still below yours...

My Results:

  • CSPResNet50: Prec@1 75.772, Prec@5 92.716 (paper results: 76.6% / 93.3%)
  • CSPResNeXt50: Prec@1 76.328, Prec@5 93.058 (paper results: 77.9% / 94.0%)

Thanks!

Chaimmoon avatar May 08 '20 09:05 Chaimmoon

@Chaimmoon

I am not sure whether it is important or not; I just follow https://pjreddie.com/darknet/imagenet/.

And I think getting a little bit lower accuracy is normal, since Darknet uses 256x256 for validation, and I guess your PyTorch code uses 224x224 instead. My CSPDarknet53 PyTorch (224x224) implementation also gets 0.6% lower top-1 accuracy than the Darknet (256x256) implementation.

Could you share your code of CSPResNet / CSPResNeXt? I would like to upload the implementation and results to the pytorch branch if it is OK.

WongKinYiu avatar May 08 '20 10:05 WongKinYiu

@WongKinYiu I'm sorry to bother you again.

I notice that the modified SAM in the YOLOv4 paper references the CBAM paper.

However, I also find that the ThunderNet paper also designs a SAM.

So I want to know:

  1. Is the SAM in the CBAM paper the same as the SAM in the ThunderNet paper?

  2. In the YOLOv4 paper, the modified SAM references the CBAM paper. But in https://github.com/AlexeyAB/darknet/issues/3708#issuecomment-518583264, LukeAI said the [sam] layer is for ThunderNet. Are the two statements in conflict? Which one is correct?

nyj-ocean avatar May 09 '20 08:05 nyj-ocean

@nyj-ocean

There are many kinds of channel attention modules (CAM) and spatial attention modules (SAM) in the literature. For example, SENet and SKNet proposed different kinds of CAM, and CBAM and ThunderNet proposed different kinds of SAM. In general, we cite the first paper, or the most similar paper, or both in related work. So the answers to your questions are:

  1. Is the SAM in the CBAM paper the same as the SAM in the ThunderNet paper?

No, they are different.

  2. In the YOLOv4 paper, the modified SAM references the CBAM paper. But in AlexeyAB/darknet#3708 (comment), LukeAI said the [sam] layer is for ThunderNet. Are the two statements in conflict? Which one is correct?

CBAM is the first paper which proposed SAM, so we cite it in the YOLOv4 paper. ThunderNet proposed the SAM module most similar to ours, so we cite it in the CSPNet paper. SAM in CBAM: (image). SAM in ThunderNet: (image).
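To make the difference concrete, here is a minimal sketch of the CBAM-style spatial attention described above (one attention weight per spatial position, broadcast across channels), as opposed to the point-wise [sam] shown earlier. This is an illustrative rewrite, not code from the CBAM or ThunderNet papers:

import torch
import torch.nn as nn

class CBAMSpatialAttention(nn.Module):
    # CBAM-style SAM: pool along the channel axis (average and max), run one conv
    # over the resulting 2-channel map, apply sigmoid, and reweight every spatial
    # position of the input feature map.
    def __init__(self, kernel_size=7):
        super().__init__()
        self.conv = nn.Conv2d(2, 1, kernel_size, padding=kernel_size // 2, bias=False)

    def forward(self, x):
        avg_map = x.mean(dim=1, keepdim=True)    # B x 1 x H x W
        max_map = x.max(dim=1, keepdim=True)[0]  # B x 1 x H x W
        attn = torch.sigmoid(self.conv(torch.cat([avg_map, max_map], dim=1)))
        return x * attn                          # broadcast over the channel dimension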

WongKinYiu avatar May 09 '20 09:05 WongKinYiu

@WongKinYiu Thanks for your reply. The YOLOv4 paper modifies SAM from spatial-wise attention to point-wise attention. So the SAM module before the modification in YOLOv4 (that is, spatial-wise attention) is similar to the SAM module in the CBAM paper?

nyj-ocean avatar May 09 '20 09:05 nyj-ocean

Yes, all of the different kinds of SAM modules produce spatial attention.

WongKinYiu avatar May 09 '20 09:05 WongKinYiu

@WongKinYiu thanks a lot

nyj-ocean avatar May 09 '20 09:05 nyj-ocean

Hi @WongKinYiu

Thanks for your reply! I think that during training and testing, the Darknet framework keeps the image size at 256x256. However, for common PyTorch training, the training size is 224x224, and the test size is 256x256. Is my understanding right?

Chaimmoon avatar May 11 '20 01:05 Chaimmoon

@Chaimmoon

It depends on your code. The most common testing protocol in PyTorch is single-crop (224x224), see https://pytorch.org/docs/stable/torchvision/models.html. The other common testing protocols nowadays are 10-crop (224x224, 5 crops x flip), 5-crop (224x224, center + 4 corners), and full (256x256).
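As a concrete example, the standard single-crop protocol used by the torchvision model zoo resizes the short side to 256 and takes a 224x224 center crop (a sketch; the normalization constants are the usual ImageNet statistics):

from torchvision import transforms

# single-crop ImageNet validation: resize short side to 256, center-crop 224x224
val_transform = transforms.Compose([
    transforms.Resize(256),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225]),
])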

WongKinYiu avatar May 11 '20 02:05 WongKinYiu

@WongKinYiu I'm sorry to bother you again. I want to produce the picture about the anchors of YOLOv3, like the following, but I don't know how to do it. Can you tell me how to produce this picture about anchors? (screenshot)

nyj-ocean avatar May 13 '20 11:05 nyj-ocean

@nyj-ocean

I do not know either; I always use the anchors which YOLO9000 calculated.

WongKinYiu avatar May 13 '20 12:05 WongKinYiu

You can calculate new anchors by using this command: ./darknet detector calc_anchors coco.data -num_of_clusters 9 -width 512 -height 512 -show

(image: cloud.png generated for the COCO dataset)
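If you want to reproduce the anchor computation outside of Darknet, a rough equivalent is to cluster the labelled box widths/heights (scaled to the network input size) with k-means. Below is a simplified sketch using plain Euclidean k-means; Darknet's calc_anchors works on the same width/height data, but its clustering details may differ, and the label file name in the usage comment is only a hypothetical example:

import numpy as np

def calc_anchors(wh, num_clusters=9, width=512, height=512, iters=100, seed=0):
    # wh: (N, 2) array of box widths/heights normalized to [0, 1] (YOLO label format)
    rng = np.random.default_rng(seed)
    boxes = wh * np.array([width, height])   # scale to the network input size
    centers = boxes[rng.choice(len(boxes), num_clusters, replace=False)]
    for _ in range(iters):
        # assign every box to its nearest cluster center
        dist = np.linalg.norm(boxes[:, None, :] - centers[None, :, :], axis=2)
        assign = dist.argmin(axis=1)
        new_centers = np.array([boxes[assign == k].mean(axis=0) if np.any(assign == k)
                                else centers[k] for k in range(num_clusters)])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return centers[np.argsort(centers.prod(axis=1))]  # sort anchors by area

# example (hypothetical file collecting all normalized w,h pairs from the labels):
# anchors = calc_anchors(np.loadtxt("all_boxes_wh.txt"), num_clusters=9, width=512, height=512)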

AlexeyAB avatar May 13 '20 12:05 AlexeyAB

@WongKinYiu thanks for your reply

@AlexeyAB Thank you so much!! It helps me a lot! If the background color of cloud.png were white, it would be better for me. How can I change the background color of cloud.png from black to white?

nyj-ocean avatar May 14 '20 13:05 nyj-ocean

  • Set the canvas to white: img = cv::Scalar::all(255); https://github.com/AlexeyAB/darknet/blob/bef28445e57cd560fa3d0a24af98a562d289135b/src/image_opencv.cpp#L1472

  • Draw the rectangles in black: cv::rectangle(img, pt1, pt2, CV_RGB(0, 0, 0), 1, 8, 0); https://github.com/AlexeyAB/darknet/blob/bef28445e57cd560fa3d0a24af98a562d289135b/src/image_opencv.cpp#L1490

AlexeyAB avatar May 14 '20 14:05 AlexeyAB

@AlexeyAB great! thanks a lot

nyj-ocean avatar May 14 '20 19:05 nyj-ocean

@AlexeyAB Sorry to bother you again. I use the following command to generate my cloud.png on my own dataset: ./darknet detector calc_anchors my-own-dataset.data -num_of_clusters 9 -width 608 -height 608 -show. The following figure is my cloud.png: (image)

I find that there are many black spare parts in my own cloud.png. However, there are almost no black spare parts in the cloud.png of the COCO dataset; the anchors almost fill the whole cloud.png of the COCO dataset (see https://github.com/WongKinYiu/CrossStagePartialNetworks/issues/24#issuecomment-627941826).

  • Is there any problem with my own cloud.png? Or is there any problem with the anchors that I generated on my own dataset?

  • How can I eliminate the black spare parts in my own cloud.png?

nyj-ocean avatar May 21 '20 10:05 nyj-ocean

I guess the images in your dataset are from videos.

WongKinYiu avatar May 21 '20 12:05 WongKinYiu

What are the black spare parts? There is no problem.

AlexeyAB avatar May 21 '20 12:05 AlexeyAB

@AlexeyAB The black spare parts are like the following:

(image)

There are many black spare parts in my own cloud.png. However, there are almost no black spare parts in the cloud.png of the COCO dataset (see https://github.com/WongKinYiu/CrossStagePartialNetworks/issues/24#issuecomment-627941826).

  • Why are there many black spare parts in my own cloud.png? Is it normal?

  • I want to eliminate these black spare parts in my own cloud.png. How can I eliminate them?

nyj-ocean avatar May 21 '20 14:05 nyj-ocean