RuntimeError when running save_logits.py

Open ywdong opened this issue 2 years ago • 2 comments

Dear Authors,

Very impressive work. However, when I use the code to save teacher logits, the following RuntimeError occurs. Any ideas?

Traceback (most recent call last):
  File "main.py", line 595, in <module>
    main(args, config)
  File "main.py", line 187, in main
    args, config, model, criterion, data_loader_train, optimizer, epoch, mixup_fn, lr_scheduler, loss_scaler)
  File "main.py", line 365, in train_one_epoch_distill_using_saved_logits
    loss = criterion(outputs, outputs_teacher)
  File "/data/miniconda3/envs/env-3.6.8/lib/python3.6/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/data/miniconda3/envs/env-3.6.8/lib/python3.6/site-packages/torch/nn/modules/loss.py", line 1048, in forward
    ignore_index=self.ignore_index, reduction=self.reduction)
  File "/data/miniconda3/envs/env-3.6.8/lib/python3.6/site-packages/torch/nn/functional.py", line 2690, in cross_entropy
    return nll_loss(log_softmax(input, 1), target, weight, None, ignore_index, None, reduction)
  File "/data/miniconda3/envs/env-3.6.8/lib/python3.6/site-packages/torch/nn/functional.py", line 2385, in nll_loss
    ret = torch._C._nn.nll_loss(input, target, weight, _Reduction.get_enum(reduction), ignore_index)
RuntimeError: Expected object of scalar type Long but got scalar type Float for argument #2 'target' in call to _thnn_nll_loss_forward

Traceback (most recent call last):
  File "/data/miniconda3/envs/env-3.6.8/lib/python3.6/runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "/data/miniconda3/envs/env-3.6.8/lib/python3.6/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/data/miniconda3/envs/env-3.6.8/lib/python3.6/site-packages/torch/distributed/launch.py", line 340, in <module>
    main()
  File "/data/miniconda3/envs/env-3.6.8/lib/python3.6/site-packages/torch/distributed/launch.py", line 326, in main
    sigkill_handler(signal.SIGTERM, None) # not coming back
  File "/data/miniconda3/envs/env-3.6.8/lib/python3.6/site-packages/torch/distributed/launch.py", line 301, in sigkill_handler
    raise subprocess.CalledProcessError(returncode=last_return_code, cmd=cmd)

ywdong, Oct 11 '22 03:10

Thanks for your attention to our work!

The error is raised by the call loss = criterion(outputs, outputs_teacher): RuntimeError: Expected object of scalar type Long but got scalar type Float for argument #2 'target'.

In older versions of PyTorch, torch.nn.CrossEntropyLoss() accepts only integer (Long) class indices as the target, so it rejects the Float-type soft labels in outputs_teacher.

You can replace criterion = torch.nn.CrossEntropyLoss(reduction='mean') at line 127 in main.py with criterion = SoftTargetCrossEntropy().
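To illustrate why the replacement works, here is a minimal sketch of a soft-target cross entropy. It assumes that SoftTargetCrossEntropy refers to the class of that name in timm.loss, and the class below is an illustrative stand-in for it rather than the repository's own code. The tensor shapes and random values are made up for the example.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class SoftTargetCrossEntropy(nn.Module):
    """Illustrative stand-in: cross entropy against float soft labels."""

    def forward(self, logits: torch.Tensor, target: torch.Tensor) -> torch.Tensor:
        # Per-sample loss: -sum_c target[c] * log_softmax(logits)[c]
        per_sample = torch.sum(-target * F.log_softmax(logits, dim=-1), dim=-1)
        # Average over the batch
        return per_sample.mean()


torch.manual_seed(0)
# Hypothetical shapes: batch of 4 samples, 10 classes
outputs = torch.randn(4, 10)                                 # student logits
outputs_teacher = torch.softmax(torch.randn(4, 10), dim=-1)  # float soft labels

criterion = SoftTargetCrossEntropy()
loss = criterion(outputs, outputs_teacher)  # float targets are accepted

# Sanity check: with one-hot targets the loss matches hard-label cross entropy
hard_labels = torch.tensor([1, 3, 5, 7])
one_hot = F.one_hot(hard_labels, num_classes=10).float()
loss_from_one_hot = criterion(outputs, one_hot)
loss_from_indices = F.cross_entropy(outputs, hard_labels)
```

Because the loss is computed directly from the float distribution, it never reaches nll_loss, which is where the Long-type check in the traceback fails.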

We will fix the bug. Thank you!

wkcn, Oct 11 '22 04:10

I have fixed the bug. Please update the code to the latest version. Thank you!

wkcn, Oct 11 '22 04:10

Closing the issue. Feel free to reopen it if any problems remain :)

wkcn, Oct 19 '22 03:10