JierunChen comments

Results: 51 comments of JierunChen

about load model

Hi @1072010918 , the pre-trained weights can be loaded by specifying the weight file in the [config](https://github.com/JierunChen/FasterNet/blob/master/detection/configs/fasternet/mask_rcnn_fasternet_l_fpn_1x_coco.py) files.
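As a rough illustration, pre-trained weights in an MMDetection-style config are usually referenced through an `init_cfg` entry on the backbone. The sketch below is not copied from the linked file: the checkpoint path and the backbone `type` string are placeholders, so check the linked config for the actual values.

```python
# Hypothetical excerpt of an MMDetection-style config (plain Python dicts).
# The path is a placeholder; point it at the weight file you downloaded.
pretrained_ckpt = "path/to/fasternet_weights.pth"  # assumption: local path

model = dict(
    backbone=dict(
        type="fasternet_l",  # assumed registry name, verify in the linked config
        init_cfg=dict(type="Pretrained", checkpoint=pretrained_ckpt),
    ),
)
```

Only the `checkpoint` value should need changing when switching to a different weight file.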

Some problems in section 3.1

@juan19941228 Yes, equation (2) only compares the memory access for spatial feature extraction and does not account for the memory access by the pointwise convolution (PWConv).
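To make the point concrete, here is a small sketch that counts memory access as "input elements read + output elements written + weights". This simplified accounting, the function names, and the tensor sizes are assumptions for illustration, not the exact form of equation (2).

```python
def feature_map_access(h, w, c_in, c_out):
    """Elements read from the input plus elements written to the output."""
    return h * w * (c_in + c_out)


def dwconv_access(h, w, c, k=3):
    # Depthwise conv: one k x k filter per channel.
    return feature_map_access(h, w, c, c) + k * k * c


def pwconv_access(h, w, c_in, c_out):
    # Pointwise (1x1) conv: c_in * c_out weights.
    return feature_map_access(h, w, c_in, c_out) + c_in * c_out


h, w, c = 56, 56, 96  # arbitrary example sizes
spatial_only = dwconv_access(h, w, c)                      # spatial extraction only
with_pwconv = spatial_only + pwconv_access(h, w, c, c)     # plus the following PWConv
print(spatial_only, with_pwconv)
```

Under this accounting the PWConv roughly doubles the total, which is the part a spatial-only comparison leaves out.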

Some problems in section 3.1

@abcsimple Hi, the comparison of equations (2) and (3) rests on the reasonable assumption that the width, i.e., the number of channels, when using DWConv is generally higher than...

slicing slower than split_cat

@wsy-yjys Hi, the slicing mode can be slower because of the feature map clone (see the code ```x = x.clone()```). Such a clone is necessary to avoid modifying the input,...

slicing slower than split_cat

@wsy-yjys Hi, ```shortcut = x``` only binds another reference to the same tensor without copying any data, so it can be much faster than the deep copy made by ```x = x.clone()```. Therefore, the implementation as you suggested would:...
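The difference between keeping a reference and making a clone can be seen in a plain-Python analogy, with lists standing in for tensors and `copy.deepcopy` standing in for `clone()`. The helper `inplace_update` is hypothetical and only mimics an in-place write to the first channels.

```python
import copy


def inplace_update(feature, n_conv):
    """Overwrite the first n_conv entries in place (stand-in for the partial conv)."""
    for i in range(n_conv):
        feature[i] = feature[i] * 10
    return feature


# Case 1: reference only. The in-place write also changes the shortcut.
x = [1, 2, 3, 4]
shortcut = x
out = inplace_update(x, n_conv=1)
aliased = shortcut == out  # both names point to the same modified list

# Case 2: deep copy first (analogous to x.clone()). The shortcut stays intact.
x = [1, 2, 3, 4]
shortcut = x
out = inplace_update(copy.deepcopy(x), n_conv=1)
preserved = shortcut == [1, 2, 3, 4]
```

The copy costs time and memory, but without it an in-place write would silently alter whatever else still refers to the input.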

Some problem 2

Hi, we haven't done it yet. You are welcome to follow any future work or try it yourself. Note that you may put some add-ons, e.g., the squeeze and...

Some questions about “PConv + PWConv”

@LKAMING97 Hi, T-shaped Conv requires a non-trivial implementation, and thus we adopt the out-of-the-box PConv and PWConv. The combination of PConv and PWConv also has lower FLOPs and fewer parameters compared to...
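For a feel of the numbers, the sketch below counts FLOPs and parameters for a PConv followed by a PWConv and for a dense 3×3 convolution. The dense baseline is chosen here only for illustration and is not necessarily the comparison the comment refers to; the partial ratio of 1/4 and the tensor sizes are likewise assumptions.

```python
def conv_flops(h, w, c_in, c_out, k):
    """Multiply-accumulates of a dense k x k conv (stride 1, same padding, no bias)."""
    return h * w * k * k * c_in * c_out


def conv_params(c_in, c_out, k):
    """Weight count of a dense k x k conv without bias."""
    return k * k * c_in * c_out


h, w, c, k = 56, 56, 96, 3  # arbitrary example sizes
c_p = c // 4                # assumed partial ratio: conv on 1/4 of the channels

# PConv (3x3 on c_p channels) followed by PWConv (1x1 on all c channels).
pconv_pwconv_flops = conv_flops(h, w, c_p, c_p, k) + conv_flops(h, w, c, c, 1)
pconv_pwconv_params = conv_params(c_p, c_p, k) + conv_params(c, c, 1)

# Illustrative baseline: dense 3x3 conv on all c channels.
regular_flops = conv_flops(h, w, c, c, k)
regular_params = conv_params(c, c, k)

print(pconv_pwconv_flops / regular_flops, pconv_pwconv_params / regular_params)
```

With these assumptions the PConv term is 1/16 of the dense 3×3 cost, and the PWConv dominates the combined block.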

Some questions about “PConv + PWConv”

@LKAMING97 Hi, the PConv can be changed into 1D, whose effectiveness depends on the input redundancy of your task.

Some questions about “PConv + PWConv”

@LKAMING97 Hi, the "evaluation command" you mentioned refers to the evaluation of performance, e.g., accuracy, regardless of the latency. Therefore, "fuse_conv_bn" is not compulsory. You may also turn it on...
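For context, fusing a convolution with its batch-norm layer folds the normalization into the conv weights, so the output is unchanged while one operation is saved at inference. Below is a scalar sketch of the standard folding arithmetic; it is not the toolbox implementation, and `fold_bn` and all numeric values are made up for illustration.

```python
import math


def fold_bn(weight, bias, gamma, beta, mean, var, eps=1e-5):
    """Fold y = gamma * (z - mean) / sqrt(var + eps) + beta into z = weight * x + bias."""
    scale = gamma / math.sqrt(var + eps)
    return weight * scale, (bias - mean) * scale + beta


weight, bias = 0.5, 0.1                      # "conv" parameters (scalar stand-in)
gamma, beta, mean, var = 1.2, -0.3, 0.4, 2.0  # batch-norm statistics and affine terms

x = 3.0
conv_out = weight * x + bias
bn_out = gamma * (conv_out - mean) / math.sqrt(var + 1e-5) + beta  # conv then BN

fused_w, fused_b = fold_bn(weight, bias, gamma, beta, mean, var)
fused_out = fused_w * x + fused_b                                  # single fused op
```

Because `fused_out` matches `bn_out` up to floating-point error, the fusion matters for latency measurements rather than for accuracy.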

why not use activation functions after downsampling convolutions?

@1920230345 Hi, we did not conduct an ablation study on this. We suggest empirical experiments for different FasterNet variants, as further incorporating activation functions may increase or decrease the model...