SHMCU comments

Results: 28 comments of SHMCU

freeze randomly at training with 100% GPU usage but no error

@zhbb1989 Did you find the solution? I am having the same issue.

freeze randomly at training with 100% GPU usage but no error

I don't remember clearly, but it seems to be a problem with the cuDNN version, CUDA driver version, and PyTorch version. I used PyTorch 1.3 with CUDA 10.2, 10.1, or 10.0...
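When a version mismatch is suspected, it can help to record which versions the installed PyTorch build reports. The following is a minimal sketch, not a diagnosis of the freeze itself; the helper name `report_versions` is hypothetical. It does not cover the driver version, which is typically read from `nvidia-smi`.

```python
def report_versions():
    """Collect the versions reported by the installed PyTorch build.

    Hypothetical helper for illustration. Values stay None when PyTorch
    (or the CUDA/cuDNN component) is not available in this environment.
    """
    info = {"torch": None, "cuda": None, "cudnn": None}
    try:
        import torch
    except ImportError:
        return info  # PyTorch is not installed here
    info["torch"] = torch.__version__
    info["cuda"] = torch.version.cuda  # None for CPU-only builds
    if torch.backends.cudnn.is_available():
        info["cudnn"] = torch.backends.cudnn.version()
    return info


if __name__ == "__main__":
    for name, version in report_versions().items():
        print(f"{name}: {version}")
```

Comparing this output between a machine that freezes and one that does not is one way to narrow down which component differs.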

Losses

I think that to make mean teacher work, you have to set consistency_weight to some value. On the mean teacher PyTorch page, it is set to 100.0. The logit_distance_cost is...

how to train with unlabeled data

There are two losses: a classification loss and a consistency loss. You can set the arguments according to the command example provided in the readme.md to get the code running. Oh, before doing that...
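The two losses mentioned above can be sketched in plain Python. This is an illustration under stated assumptions, not the repository's code: unlabeled examples are marked with a hypothetical `NO_LABEL = -1`, the classification loss is cross-entropy over labeled examples only, and the consistency loss is the mean squared difference between student and teacher softmax outputs over all examples. All function names are made up for this sketch.

```python
import math

NO_LABEL = -1  # assumed marker for unlabeled examples


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def classification_loss(student_logits, labels):
    """Mean cross-entropy over the labeled examples only."""
    losses = [-math.log(softmax(z)[y])
              for z, y in zip(student_logits, labels) if y != NO_LABEL]
    return sum(losses) / len(losses) if losses else 0.0


def consistency_loss(student_logits, teacher_logits):
    """Mean squared difference of softmax outputs, over all examples."""
    per_example = []
    for zs, zt in zip(student_logits, teacher_logits):
        ps, pt = softmax(zs), softmax(zt)
        per_example.append(sum((a - b) ** 2 for a, b in zip(ps, pt)) / len(ps))
    return sum(per_example) / len(per_example)


def total_loss(student_logits, teacher_logits, labels, consistency_weight):
    """Classification loss plus the weighted consistency loss."""
    return (classification_loss(student_logits, labels)
            + consistency_weight * consistency_loss(student_logits, teacher_logits))
```

With `consistency_weight` set to 0, unlabeled examples contribute nothing to the total, which is why the weight has to be set for the unlabeled data to matter.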

Cannot reproduce the error rate 10.08+/-0.41 with mean teacher + ResNet on 1000 label

Thank you! Were the results in the paper obtained with the TensorFlow or the PyTorch implementation? I am using the PyTorch code to train the shakeshake26 network on 1000 labels. I set...

Cannot reproduce the error rate 10.08+/-0.41 with mean teacher + ResNet on 1000 label

Hello Tarvaina, I found that I had missed one setting: the MSE between the two different logits output by the student model. In the appendix, it says the cost of the...
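As a rough illustration of an MSE between two logit outputs, the sketch below assumes the student produces two logit vectors per example (named `class_logits` and `cons_logits` here purely for illustration) and averages the squared differences over classes and examples. The exact normalization and weighting used in the paper or the repository may differ.

```python
def logit_distance(class_logits, cons_logits):
    """Mean squared difference between two sets of logits.

    Both arguments are lists of per-example logit lists of equal shape.
    Names and normalization are assumptions made for this sketch.
    """
    per_example = []
    for a, b in zip(class_logits, cons_logits):
        per_example.append(sum((x - y) ** 2 for x, y in zip(a, b)) / len(a))
    return sum(per_example) / len(per_example)


# Two classes, one example: squared differences are 0 and 4, so the mean is 2.
print(logit_distance([[1.0, 2.0]], [[1.0, 4.0]]))
```

In a setup like this, the result would be multiplied by a weight such as logit_distance_cost and added to the other loss terms.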

Cannot reproduce the error rate 10.08+/-0.41 with mean teacher + ResNet on 1000 label

Thank you Tarvaina! I have not run this script yet. I will try cifar10_test.py

Cannot reproduce the error rate 10.08+/-0.41 with mean teacher + ResNet on 1000 label

I have tried cifar10_test.py. With 4000 labels, it reaches the reported performance. However, with 1000 labels, it only reaches 82.33%. I kept all the hyperparameters...

Cannot reproduce the error rate 10.08+/-0.41 with mean teacher + ResNet on 1000 label

Now I can reproduce the result on 1000 labels. With the hyperparameter settings in the README.md, I trained it on one GPU. I trained it for a really large number of epochs...

Cannot reproduce the error rate 10.08+/-0.41 with mean teacher + ResNet on 1000 label

Is that because, when training on multiple GPUs, the loss is computed on each GPU separately and then gathered and averaged? The average loss is actually larger than the...
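The arithmetic behind this question can be shown with toy numbers. The sketch assumes each replica normalizes its loss by its own count of labeled examples before the replica losses are averaged. Whether the training code actually does this is not established here, and the values are invented for illustration.

```python
def mean_over_labeled(losses):
    """Average per-example losses, skipping unlabeled entries (None)."""
    labeled = [x for x in losses if x is not None]
    return sum(labeled) / len(labeled) if labeled else 0.0


def split(batch, n_replicas):
    """Split a batch into n contiguous, equally sized shards."""
    size = len(batch) // n_replicas
    return [batch[i * size:(i + 1) * size] for i in range(n_replicas)]


# Toy per-example losses; None marks an unlabeled example.
batch = [4.0, None, None, None, 1.0, 1.0, 1.0, 1.0]

# Single device: one average over all five labeled examples.
single = mean_over_labeled(batch)

# Two replicas: each averages over its own labeled examples first.
per_replica = [mean_over_labeled(shard) for shard in split(batch, 2)]
averaged = sum(per_replica) / len(per_replica)

print(single, per_replica, averaged)
```

Here the single-device value is (4 + 1 + 1 + 1 + 1) / 5, while the replica average is (4 + 1) / 2, so the two disagree whenever labeled examples are unevenly distributed across shards.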