Yunsheng Li
The Micro-Factorized Pointwise convolution has nothing to do with depthwise convolution. It is composed of two group convolutions with a shuffle between them. Actually, the code is organized a little...
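To make the "two group convolutions with a shuffle between them" structure concrete, here is a minimal pure-Python sketch that acts on a single pixel's channel vector. It is not the repository's implementation: the function names, the weight layout, and the ShuffleNet-style shuffle are assumptions chosen for illustration, and the actual permutation used in the code may differ.

```python
def group_pointwise(x, weights, groups):
    """Grouped 1x1 convolution on one pixel's channel vector (hypothetical helper).

    x: list of C_in values.
    weights: one matrix per group, each with C_out/groups rows of
             C_in/groups values (nested lists).
    """
    cin_per_group = len(x) // groups
    out = []
    for g in range(groups):
        xs = x[g * cin_per_group:(g + 1) * cin_per_group]
        for row in weights[g]:
            # Each output channel only sees the inputs of its own group.
            out.append(sum(w * v for w, v in zip(row, xs)))
    return out


def channel_shuffle(x, groups):
    """Assumed ShuffleNet-style shuffle: view as (groups, n), transpose, flatten."""
    n = len(x) // groups
    return [x[g * n + i] for i in range(n) for g in range(groups)]


def factorized_pointwise(x, w1, w2, groups):
    """Group conv, then shuffle, then group conv (illustrative composition)."""
    hidden = group_pointwise(x, w1, groups)
    hidden = channel_shuffle(hidden, groups)
    return group_pointwise(hidden, w2, groups)


# Toy weights: the first conv is the identity within each group,
# the second sums the two channels of each group.
identity = [[1, 0], [0, 1]]
w1 = [identity, identity]
w2 = [[[1, 1]], [[1, 1]]]
result = factorized_pointwise([1, 2, 3, 4], w1, w2, groups=2)
print(result)  # [4, 6]
```

In this toy run the shuffle reorders the hidden channels to [1, 3, 2, 4], so each group of the second convolution mixes channels that came from different groups of the first. No depthwise convolution is involved at any step.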
Which dataset are you using? Are you using the default hyperparameters, or have you modified them?
You just need to modify the DataLoader I wrote.
Maybe you no longer need self.id_to_trainid.
No, there is no inconsistency between the code and the paper. During evaluation, I only evaluate 16 of the common classes.
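As a rough illustration of evaluating only a subset of classes, the sketch below averages per-class IoU over a chosen list of class indices. The function name, the confusion-matrix layout, and the example indices are hypothetical and are not taken from the repository's evaluation script.

```python
def mean_iou(confusion, class_ids):
    """Mean IoU over `class_ids`, given confusion[true][pred] pixel counts."""
    ious = []
    for c in class_ids:
        tp = confusion[c][c]
        fn = sum(confusion[c]) - tp                      # true class c, predicted otherwise
        fp = sum(row[c] for row in confusion) - tp       # predicted c, true class otherwise
        denom = tp + fp + fn
        if denom > 0:  # skip classes absent from both labels and predictions
            ious.append(tp / denom)
    return sum(ious) / len(ious) if ious else 0.0


# Toy 3-class confusion matrix (made-up counts for illustration only).
confusion = [
    [2, 0, 0],
    [0, 1, 1],
    [0, 1, 1],
]
all_classes = mean_iou(confusion, [0, 1, 2])
subset_only = mean_iou(confusion, [0, 1])  # hypothetical subset of evaluated classes
```

Averaging over a subset generally gives a different number than averaging over every class, which is why the set of evaluated classes has to match when comparing results.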
s, n, c, and ks represent the stride, number of repeated layers, network width, and kernel sizes, respectively. The product of c1 and c2 is the expansion between input channels to the...
The smallest model does have some stability issues, but it should have less than 0.5% variance. I'm curious whether you can reproduce the result with the released model.
The largest model should be stable. In the experiments I did, the larger the model, the more stable the performance.
It is a little hard to decide. For training without SSL, I find that stopping at 80000 iterations works best. When I continue to train with more iterations, overfitting will...
Not really. The best result always appears at around 80000 iterations, so it is unnecessary to validate all snapshots.