Haojun Jiang(蒋昊峻) issues

Results: 9 issues of Haojun Jiang(蒋昊峻)

Could anyone share a copy of the Jester dataset?

This dataset is no longer available on the TwentyBN website. Could anyone share it via a drive link, please? Thanks a lot.

Questions about the performance on ImageNet

Has anyone trained the resattentionnet on ImageNet? The paper does not give the batch size for ImageNet training, so I set batchsize=256 and lr=0.1, which is a common setting, but the training...

Why is the output channel of SKNet101 conv3_x/B_fc1 16?

I find that the output channel of [conv3_x/B_fc1](https://github.com/implus/SKNet/blob/17dd7086d4959be0c8c6ccce52e4fd8187836770/models/sknet101.prototxt#L1683) is 16, which is quite confusing. As the SKNet paper mentions, the output channel d of the first fully connected layer (B_fc1) follows equation (4), which is...
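To make the question concrete, here is a minimal sketch. It assumes equation (4) has the form d = max(C/r, L); this is an assumption, since the text above is cut off before the formula. The function name `sk_reduced_dim` and the values C=256, r=16, L=32 are illustrative and are not read from the prototxt.

```python
def sk_reduced_dim(channels, reduction, min_dim):
    """Width d of the first FC layer, ASSUMING d = max(C / r, L).

    channels  : C, number of input feature channels (illustrative)
    reduction : r, reduction ratio (illustrative)
    min_dim   : L, lower bound on d (illustrative)
    """
    return max(channels // reduction, min_dim)


# With these illustrative values the lower bound L decides the result.
d_with_bound = sk_reduced_dim(256, 16, 32)

# Dropping the lower bound (L = 0) leaves only the C / r term.
d_without_bound = sk_reduced_dim(256, 16, 0)

print(d_with_bound, d_without_bound)
```

Under these assumed values, the two calls differ only in whether the lower bound L is applied, which is one possible way a width of 16 could arise.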

A bn-relu module before classifier is missing in the code.

Hi, many thanks for the PyTorch implementation. However, it is missing a bn-relu module that is implemented in [cypw/DPNs](https://github.com/cypw/DPNs). Read the [code](https://github.com/cypw/DPNs/blob/5766ebef5ba7a1e79a1e6be71878fb016b67d4b2/settings/symbol_dpn-92.py#L62) for more details.

What is the purpose of setting flops bins when training SuperNet?

https://github.com/megvii-model/SinglePathOneShot/blob/36eed6cf083497ffa9cfe7b8da25bb0b6ba5a452/src/Supernet/train.py#L213 It seems that the author wants the candidates' FLOPs to be distributed uniformly over [290, 360]. However, I randomly sampled 12500 candidates and calculated their FLOPs. The natural distribution of the ShuffleNet...
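For readers unfamiliar with the idea, the following is a rough sketch of what sampling with FLOPs bins could look like: candidates are drawn at random and kept only while their bin still has room, so the kept set is spread evenly over the range. It is not the code from the linked repository. The names `sample_per_flops_bin`, `toy_candidate`, and `toy_flops`, and the toy FLOPs model, are all hypothetical.

```python
import random


def sample_per_flops_bin(sample_candidate, flops_of, lo, hi,
                         n_bins, per_bin, max_tries=100_000):
    """Rejection-sample candidates until every FLOPs bin holds `per_bin`."""
    width = (hi - lo) / n_bins
    bins = [[] for _ in range(n_bins)]
    for _ in range(max_tries):
        if all(len(b) >= per_bin for b in bins):
            break
        cand = sample_candidate()
        flops = flops_of(cand)
        if not lo <= flops <= hi:
            continue  # outside the target range: reject
        # The upper edge `hi` is folded into the last bin.
        idx = min(int((flops - lo) / width), n_bins - 1)
        if len(bins[idx]) < per_bin:
            bins[idx].append(cand)  # keep only while the bin has room
    return bins


rng = random.Random(0)


def toy_candidate():
    # Hypothetical architecture: 4 layers, 4 block choices per layer.
    return [rng.randrange(4) for _ in range(4)]


def toy_flops(cand):
    # Hypothetical cost model mapping a candidate into [290, 360].
    return 290 + sum(cand) * 70 / 12


bins = sample_per_flops_bin(toy_candidate, toy_flops, 290, 360,
                            n_bins=7, per_bin=5)
for i, b in enumerate(bins):
    print(f"bin {i}: {len(b)} candidates")
```

In this toy model the unconstrained samples cluster around the middle of the range, so the bins near 290 and 360 fill much more slowly than the central ones.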

Questions about LSTM_classifier

Problem about weight rescaling technique

What is the strategy for initializing the task_embedding, layer_id_embeddings, and adapters_block_type embeddings?

Why normalize twice, given that the target encoder also normalizes the feature at the end?