Duo Li
@tarun005 Have you tested with the SGD optimizer? Does it drive the training process to convergence?
@tarun005 I suppose that BN should be independent of the optimization method, but when I used syncbn by just adding the folder `lib` to $PATH, I ran into an error...
@815961618 The figures listed (DRN-D-22: 16.4M, DRN-D-38: 26.5M, DRN-D-105: 54.8M) are parameter counts, not the actual sizes of the model files.
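To make the distinction concrete, here is a rough sketch of how a parameter count relates to a file size. It assumes the weights are stored as 32-bit floats (4 bytes each) and ignores any file-format overhead, so the real files may differ; `approx_size_mb` is a hypothetical helper, not something from the repository.

```python
# Parameter counts quoted above (in raw numbers of parameters).
PARAM_COUNTS = {
    "DRN-D-22": 16.4e6,
    "DRN-D-38": 26.5e6,
    "DRN-D-105": 54.8e6,
}


def approx_size_mb(num_params, bytes_per_param=4):
    """Approximate storage in megabytes (10^6 bytes).

    Assumption: every parameter is stored as float32 (4 bytes) and the
    file contains nothing else (no optimizer state, no metadata).
    """
    return num_params * bytes_per_param / 1e6


for name, count in PARAM_COUNTS.items():
    print(f"{name}: {count / 1e6:.1f}M parameters, roughly {approx_size_mb(count):.1f} MB")
```

Under that assumption a file is roughly four times larger in MB than the parameter count in millions, which is why the two numbers should not be expected to match.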
@wldeephi Have you solved the problem? I have tried this pretrained model and got reasonable performance. @lxtGH The evaluation process doesn't need a large batch size.
@apli What reproduction results did you get specifically? I ran the command from README.md but got only about 60% test_acc with resnet18 learning from resnet34.
@tangbohu I assume that β is 10^3 / batch_size / (feature_map_size)^2. This division occurs in the averaging function [here](https://github.com/szagoruyko/attention-transfer/blob/master/utils.py#L23). In practice, batch size is set to 128 by default, and...
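As a quick check of the arithmetic, here is a minimal sketch of the effective β under the assumption stated above, i.e. that the nominal 10^3 is divided by the batch size and by the squared feature map size. The function name `effective_beta` and the example feature map sizes are hypothetical and chosen only for illustration; they are not taken from the repository.

```python
def effective_beta(beta=1e3, batch_size=128, feature_map_size=8):
    """Nominal beta divided by the number of averaged elements.

    Assumption: the loss is a mean over batch_size * feature_map_size**2
    elements, so the per-element weight is beta divided by that count.
    """
    return beta / batch_size / feature_map_size ** 2


# Hypothetical feature map sizes, with the default batch size of 128.
for size in (32, 16, 8):
    print(f"feature_map_size={size}: effective beta = {effective_beta(feature_map_size=size)}")
```

The effective weight shrinks quadratically as the feature map grows, so layers with larger spatial resolution contribute a smaller per-element weight for the same nominal β.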
@Aky97567 @tingtinglu Add `CPU_ONLY := 1` to your Makefile.config; hopefully that helps.
I suppose it would be interesting to add [CSN](https://arxiv.org/abs/1904.02811) and [X3D](https://arxiv.org/abs/2004.04730) by FAIR into the supported model family. I also have an interest in helping implement/review them if time permits.
OK, thanks for adding this new feature.
Could you please tell me how the 'train', 'val', and 'test' split is prepared in advance in `camvid_loader.py`? As far as I can see, the original dataset takes them as a whole...