VAC_CSLR

Question about CPU or GPU error

Open chunguangqu opened this issue 3 years ago • 5 comments

I ran your code and got the following error. Where are the parameters moved to the GPU?

Traceback (most recent call last):
  File "main.py", line 218, in <module>
    processor.start()
  File "main.py", line 46, in start
    seq_train(self.data_loader['train'], self.model, self.optimizer,self.device, epoch, self.recoder)
  File "/home/quchunguang/sunday/CSLR/seq_scripts.py", line 24, in seq_train
    loss = model.criterion_calculation(ret_dict, label, label_lgt)
  File "/home/quchunguang/sunday/CSLR/slr_network.py", line 96, in criterion_calculation
    label_lgt.cpu().int()).mean()
  File "/home/quchunguang/anaconda3/envs/tf/lib/python3.6/site-packages/torch/nn/modules/module.py", line 547, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/quchunguang/anaconda3/envs/tf/lib/python3.6/site-packages/torch/nn/modules/loss.py", line 1295, in forward
    self.zero_infinity)
  File "/home/quchunguang/anaconda3/envs/tf/lib/python3.6/site-packages/torch/nn/functional.py", line 1767, in ctc_loss
    zero_infinity)
RuntimeError: Tensor for argument #2 'targets' is on CPU, but expected it to be on GPU (while checking arguments for ctc_loss_gpu)

chunguangqu, Jan 21 '22 07:01

I haven't run into this problem before. It seems to be related to the switch between the native and cuDNN implementations of the CTC loss, which came up in an earlier discussion about PyTorch.

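Recent versions of PyTorch check the argument devices differently depending on the backend:

For native: checkAllSameGPU(c, {log_probs_arg, targets_arg});
For cudnn: checkBackend(c, {*log_probs}, Backend::CUDA); checkBackend(c, {*targets}, Backend::CPU);

That is, the native implementation expects log_probs and targets on the same GPU, while the cuDNN implementation expects log_probs on CUDA and targets on the CPU.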

You can debug this based on your PyTorch version and the backend in use (see torch.backends.cudnn.enabled).
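A minimal sketch of that check, assuming PyTorch may or may not be importable in the current environment. The helper names `describe_ctc_setup` and `align_targets` are hypothetical and not part of the repository. `align_targets` only illustrates the native-backend requirement that `targets` sit on the same device as `log_probs`; the cuDNN check quoted above expects CPU targets instead, so this is one possible workaround rather than a fix that fits every version.

```python
def describe_ctc_setup():
    """Report the PyTorch version and cuDNN state (hypothetical helper)."""
    try:
        import torch
    except ImportError:
        # PyTorch is not installed in this environment.
        return {"torch_installed": False}
    return {
        "torch_installed": True,
        "torch_version": torch.__version__,
        "cuda_available": torch.cuda.is_available(),
        "cudnn_enabled": torch.backends.cudnn.enabled,
    }


def align_targets(log_probs, targets):
    """Move targets to the device of log_probs (hypothetical helper).

    Assumption: the native CTC implementation is in use, which expects
    both tensors on the same GPU.
    """
    if targets.device != log_probs.device:
        targets = targets.to(log_probs.device)
    return targets


print(describe_ctc_setup())
```

Comparing the printed version and `cudnn_enabled` value against the device of the labels passed to the loss should show which of the two checks is being hit.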

ycmin95, Jan 21 '22 08:01

What is your PyTorch version? Thanks.

chunguangqu, Jan 24 '22 02:01

1.10.0 and 1.10.1 both work for me.

ycmin95, Jan 24 '22 03:01

My PyTorch version is also 1.10.0. What is your ctcdecode version? When I run main.py, the following error occurs:

(your) (base) quchunguang@ubuntu:~/sunday/CSLR$ python main.py
Loading model
Traceback (most recent call last):
  File "main.py", line 207, in <module>
    processor = Processor(args)
  File "main.py", line 33, in __init__
    self.model, self.optimizer = self.loading()
  File "main.py", line 96, in loading
    loss_weights=self.arg.loss_weights,
  File "/home/quchunguang/sunday/CSLR/slr_network.py", line 38, in __init__
    self.decoder = utils.Decode(gloss_dict, num_classes, 'beam')
  File "/home/quchunguang/sunday/CSLR/utils/decode.py", line 19, in __init__
    self.ctc_decoder = ctcdecode.CTCBeamDecoder(vocab, beam_width=10, blank_id=blank_id,
AttributeError: module 'ctcdecode' has no attribute 'CTCBeamDecoder'

chunguangqu, Jan 24 '22 03:01

The version is listed in the README. It seems that ctcdecode was not installed successfully.
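One way to confirm this is to inspect what Python actually imports under the name `ctcdecode`. The sketch below uses only the standard library; `inspect_module` is a hypothetical helper, not something provided by the repository or by ctcdecode. A possible cause (an assumption, not confirmed in this thread) is that an unbuilt source folder named `ctcdecode` is picked up instead of the installed package, in which case the reported location is missing or points into the project directory.

```python
import importlib


def inspect_module(name, attribute):
    """Report whether `name` imports and exposes `attribute` (hypothetical helper)."""
    try:
        module = importlib.import_module(name)
    except ImportError as exc:
        return {"importable": False, "error": str(exc)}
    return {
        "importable": True,
        # None, or a path outside site-packages, hints that a local folder
        # with the same name is shadowing the real installation.
        "location": getattr(module, "__file__", None),
        "has_attribute": hasattr(module, attribute),
    }


print(inspect_module("ctcdecode", "CTCBeamDecoder"))
```

If `importable` is true but `has_attribute` is false, the import resolved to something other than a complete ctcdecode build, and reinstalling it as described in the README is the likely next step.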

ycmin95, Jan 24 '22 03:01