Haojun Jiang(蒋昊峻)
This dataset is no longer available on the TwentyBN website. Could anyone please share it via a drive link? Thanks a lot.
Has anyone trained resattentionnet on ImageNet? The paper doesn't give the batch size for ImageNet training, so I set batchsize=256/lr=0.1, which is a common setting, but the training...
I find that the output channel count of [conv3_x/B_fc1](https://github.com/implus/SKNet/blob/17dd7086d4959be0c8c6ccce52e4fd8187836770/models/sknet101.prototxt#L1683) is 16, which is quite confusing. As the SKNet paper mentions, the output channel count d of the first fully connected layer (B_fc1) follows equation (4), which is...
Hi, many thanks for the PyTorch implementation. However, it is missing a bn-relu module that is implemented in [cypw/DPNs](https://github.com/cypw/DPNs). See the [code](https://github.com/cypw/DPNs/blob/5766ebef5ba7a1e79a1e6be71878fb016b67d4b2/settings/symbol_dpn-92.py#L62) for more details.
https://github.com/megvii-model/SinglePathOneShot/blob/36eed6cf083497ffa9cfe7b8da25bb0b6ba5a452/src/Supernet/train.py#L213 It seems that the authors want the candidates' FLOPs to be distributed uniformly over [290, 360]. However, I randomly sampled 12500 candidates and calculated their FLOPs. The natural distribution of the ShuffleNet...
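To make the "uniform over [290, 360]" idea concrete, here is a minimal sketch of bin-based rejection sampling: pick a FLOPs bin uniformly at random, then draw random candidates until one lands in that bin. It is only an illustration under stated assumptions, not the repository's code. The per-choice costs in `CHOICE_MFLOPS`, the 20-layer / 4-choice search space, the 10-MFLOPs bin width, and all helper names are hypothetical.

```python
import random

NUM_LAYERS = 20    # assumed depth of the toy search space
NUM_CHOICES = 4    # assumed number of block choices per layer
# Hypothetical per-choice cost in MFLOPs; NOT the real ShuffleNet numbers.
CHOICE_MFLOPS = (12.0, 15.0, 17.5, 20.5)


def random_candidate(rng):
    """Draw one architecture: an independent random choice per layer."""
    return tuple(rng.randrange(NUM_CHOICES) for _ in range(NUM_LAYERS))


def toy_mflops(cand):
    """Toy cost model: sum of the per-layer costs (stand-in for a real FLOPs counter)."""
    return sum(CHOICE_MFLOPS[c] for c in cand)


def make_bins(lo=290, hi=360, step=10):
    """Split [lo, hi] into consecutive bins of width `step` (in MFLOPs)."""
    return [(left, left + step) for left in range(lo, hi, step)]


def sample_uniform_over_bins(rng, bins, timeout=500):
    """Pick a bin uniformly, then rejection-sample a candidate inside it.

    Falls back to an unconstrained random candidate if no match is found
    within `timeout` tries.
    """
    lo, hi = rng.choice(bins)
    for _ in range(timeout):
        cand = random_candidate(rng)
        if lo <= toy_mflops(cand) <= hi:
            return cand
    return random_candidate(rng)


if __name__ == "__main__":
    rng = random.Random(0)
    bins = make_bins()
    # Count how many constrained samples fall into each bin.
    counts = {b: 0 for b in bins}
    for _ in range(700):
        flops = toy_mflops(sample_uniform_over_bins(rng, bins))
        for lo, hi in bins:
            if lo <= flops <= hi:
                counts[(lo, hi)] += 1
                break
    print(counts)
```

Under this toy cost model, unconstrained sampling concentrates near the middle of the range, while the bin-first sampler spreads candidates much more evenly across bins. A candidate that sits exactly on a shared bin edge is counted in the lower bin by the loop above.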
1. The authors mention "We initialize an LSTM classifier with the weights learned by the encoder LSTM from this model." in their paper, but I am a beginner and...
I am wondering which parameters you used when you rescaled the model. There are many parameters in magic_init.py, such as -t, -nit, and -d. Could you please give more details? Thanks.
@rabeehk It seems all these embeddings are initialized from PyTorch's default Gaussian (normal) distribution, N(0, 1).
https://github.com/facebookresearch/ijepa/blob/52c1ae95d05f743e000e8f10a1f3a79b10cff048/src/train.py#L295-L298 https://github.com/facebookresearch/ijepa/blob/52c1ae95d05f743e000e8f10a1f3a79b10cff048/src/models/vision_transformer.py#L422-L425