Ahmed Taha

Results: 15 comments by Ahmed Taha

Do you know where the code/implementation of [1] is? [1] Correlation Congruence for Knowledge Distillation

Do you remember the changes needed to make the code build successfully on TF 1.4?

Thanks, but `\pagenumbering{gobble}` seems simpler -- just one line. Do you have any problems with it?

Thanks for your reply. BTW, `\pagenumbering{gobble}` passes the PDF eXpress as well.
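For reference, a minimal sketch of where the command goes (the document class and body text are placeholders, not the camera-ready source):

```latex
\documentclass{article}
\begin{document}
\pagenumbering{gobble}  % suppress page numbers throughout the document
Camera-ready text without page numbers.
\end{document}
```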

I experienced this issue. It seems related to this other [issue](https://github.com/4uiiurz1/pytorch-adacos/issues/10). My fix is to change the optimizer from

```python
optimizer = optim.SGD(filter(lambda p: p.requires_grad, model.parameters()),
                      lr=args.lr, momentum=args.momentum,
                      weight_decay=args.weight_decay)
```
...
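To illustrate the pattern in the quoted snippet (building SGD over trainable parameters only), here is a self-contained toy sketch; the model, layer sizes, and hyperparameters are illustrative, and the replacement optimizer in the original comment is truncated, so this shows only the "from" side:

```python
import torch
from torch import nn, optim

# Toy model with a frozen first layer, mirroring the filtered-parameter
# pattern quoted above (filter(lambda p: p.requires_grad, ...)).
torch.manual_seed(0)
model = nn.Sequential(nn.Linear(4, 8), nn.ReLU(), nn.Linear(8, 2))
for p in model[0].parameters():
    p.requires_grad = False  # freeze the first layer

# Build SGD over trainable parameters only.
trainable = [p for p in model.parameters() if p.requires_grad]
optimizer = optim.SGD(trainable, lr=0.1, momentum=0.9, weight_decay=1e-4)

# One sanity step: frozen weights must stay unchanged after step().
w_frozen_before = model[0].weight.detach().clone()
loss = model(torch.randn(3, 4)).pow(2).mean()
loss.backward()
optimizer.step()
```

Because the frozen layer never receives gradients and is excluded from the optimizer, its weights are untouched by `step()`.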

I found another issue that raises the nan value. The [scale variable s](https://github.com/4uiiurz1/pytorch-adacos/blob/35c086cc07087657595e1d10eaddefb3b3a46f35/metrics.py#L36) should be updated during training only, i.e., using the training split. However, it is updated every time the...
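A minimal sketch of the guard being described: update the adaptive scale `s` only when the module is in training mode, so validation/test batches leave it untouched. The class name and update details below are illustrative (loosely following the AdaCos dynamic-scale rule), not the repo's exact code:

```python
import math
import torch
from torch import nn

class AdaCosHead(nn.Module):
    """Illustrative sketch: the adaptive scale `s` is updated only
    in training mode (self.training is True)."""
    def __init__(self, num_classes):
        super().__init__()
        self.num_classes = num_classes
        self.s = math.sqrt(2.0) * math.log(num_classes - 1)

    def forward(self, cos_theta, labels=None):
        # Guard: skip the scale update on eval/test batches.
        if self.training and labels is not None:
            with torch.no_grad():
                theta = torch.acos(cos_theta.clamp(-1 + 1e-7, 1 - 1e-7))
                one_hot = torch.zeros_like(cos_theta)
                one_hot.scatter_(1, labels.view(-1, 1), 1.0)
                # Average of exp(s * cos) over non-target classes.
                b_avg = torch.where(one_hot < 1,
                                    torch.exp(self.s * cos_theta),
                                    torch.zeros_like(cos_theta))
                b_avg = b_avg.sum(dim=1).mean()
                theta_med = theta[one_hot == 1].median()
                self.s = (torch.log(b_avg) /
                          torch.cos(torch.clamp(theta_med,
                                                max=math.pi / 4))).item()
        return self.s * cos_theta
```

Calling `model.eval()` before running the validation split then keeps `s` fixed, which is the behavior the comment argues for.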

Hi Hattie, I checked out your commit, but I can't run it because it leverages datasets.ImageFolder. My dataset doesn't have train, val, and test directories. Accordingly, I can't run/evaluate your code. It...

Hi Hattie, I am getting mixed signals here -- "2-3% higher than the reported result in the paper". Which version achieves this 2-3% higher performance?

That makes more sense. It is good that your cs-kd implementation achieves ~6% higher results. I am no longer confident about my cs-kd baseline. I used multiple samplers in my...

Hi LeoLiu2n, 1. `in_channels_order` has no value for ResNets but is vital for [DenseNets](https://github.com/ahmdtaha/knowledge_evolution/blob/a3f2eb2eed7accb86ad1af2a15c13e4a9654fe16/models/split_densenet.py#L42). The [paper's Appendix B](https://arxiv.org/pdf/2103.05152.pdf) illustrates the importance of this variable (Fig. 10 and 11). 2. As mentioned in...