pytorch-be-your-own-teacher
A PyTorch implementation of the paper 'Be Your Own Teacher: Improve the Performance of Convolutional Neural Networks via Self Distillation', https://arxiv.org/abs/1905.08094
Hi, in the standard resnet18 from [Pytorch](https://pytorch.org/vision/main/generated/torchvision.models.resnet18.html), when a `3,224,224` input is fed, `layer4` should output a feature map of size `512x7x7`. However, in your repo, the final feature map size is `512x28x28`....
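The `7x7` vs `28x28` gap comes down to how many stride-2 downsamplings the network applies. A minimal sketch of the spatial arithmetic, assuming the repo uses a CIFAR-style stem (stride-1 first conv, no max-pool), which is a common modification and would explain the `28x28` output on a 224x224 input:

```python
def out_size(size: int, stride: int) -> int:
    # A stride-s conv/pool with "same"-style padding shrinks the map by ceil(size / s).
    return (size + stride - 1) // stride

def resnet18_imagenet_spatial(size: int = 224) -> int:
    # Standard torchvision stem: conv1 (stride 2) then maxpool (stride 2),
    # followed by layer1..layer4 with strides 1, 2, 2, 2.
    for stride in (2, 2, 1, 2, 2, 2):
        size = out_size(size, stride)
    return size  # 224 -> 112 -> 56 -> 56 -> 28 -> 14 -> 7

def resnet18_cifar_style_spatial(size: int = 224) -> int:
    # Hypothetical CIFAR-style variant: stride-1 conv1, no maxpool,
    # then layer1..layer4 with strides 1, 2, 2, 2 -- only 3 downsamplings,
    # so a 224x224 input leaves layer4 at 28x28.
    for stride in (1, 1, 2, 2, 2):
        size = out_size(size, stride)
    return size
```

On a 224x224 input the standard network ends at 7x7 and the CIFAR-style variant at 28x28; on the 32x32 CIFAR input the variant ends at the usual 4x4.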
Hi, thanks for sharing this code; it's really helpful. I recently read your paper "MSD: Multi-Self-Distillation Learning via Multi-classifiers within Deep Neural Networks". It's a very interesting work, and the results...
Thank you for sharing the code. I also re-implemented the algorithm but can't obtain results close to those in the paper. I wonder if there are some tricks in training, but...
Hi, thanks for sharing the code, but I have some questions. In my understanding, middle_output1, middle_output2, and middle_output3 should be different from output. But in the code, they are the...
Hi, thanks for sharing this code. However, there seems to be no L2 loss on intermediate features in this code. This is my question about **loss source 3** in...
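For context, the paper trains each intermediate classifier with three loss sources: cross-entropy against the labels, KL divergence against the softened output of the deepest classifier, and an L2 "hint" loss between intermediate and final features. A minimal pure-Python sketch of how the three terms combine (function names, the flat feature vectors, and the `alpha`/`lam` weights are illustrative assumptions, not the repo's code):

```python
import math

def softmax(logits, T=1.0):
    # Softened distribution at temperature T.
    exps = [math.exp(z / T) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(logits, label):
    # Loss source 1: standard CE against the hard label.
    return -math.log(softmax(logits)[label])

def kl_div(p, q):
    # Loss source 2: KL(teacher softened || student softened).
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))

def l2_hint(student_feat, teacher_feat):
    # Loss source 3: squared L2 distance between (flattened) feature maps.
    return sum((s - t) ** 2 for s, t in zip(student_feat, teacher_feat))

def self_distillation_loss(final_logits, middle_logits_list, middle_feats_list,
                           final_feat, label, alpha=0.1, lam=1e-6, T=3.0):
    loss = cross_entropy(final_logits, label)
    teacher_soft = softmax(final_logits, T)
    for logits, feat in zip(middle_logits_list, middle_feats_list):
        loss += cross_entropy(logits, label)                  # source 1
        loss += alpha * kl_div(teacher_soft, softmax(logits, T))  # source 2
        loss += lam * l2_hint(feat, final_feat)               # source 3
    return loss
```

If the third term is missing from the training script, only loss sources 1 and 2 are being applied, which may be what this issue is pointing out.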
Hi, I recently downloaded your code and ran some experiments. I found that the performance of self-distillation was about the same as just adding a label loss at each stage....