
Training Error

Open Rajratnpranesh opened this issue 3 years ago • 0 comments

I am running train.py and get the error below. I dug in and found that the input to the norm layer in models.py does not have the correct dimensions: in some places the input dimensions are swapped. I tried fixing it, but then other parameters became wrong. Can you suggest a fix? I have installed everything as per the README.md. I guess the code in models.py needs to be fixed. Please let me know.

Ty

Traceback (most recent call last):
  File "/content/edgedict/train.py", line 385, in <module>
    app.run(main)
  File "/usr/local/lib/python3.7/dist-packages/absl/app.py", line 303, in run
    _run_main(main, args)
  File "/usr/local/lib/python3.7/dist-packages/absl/app.py", line 251, in _run_main
    sys.exit(main(argv))
  File "/content/edgedict/train.py", line 381, in main
    trainer.train(start_step=step)
  File "/content/edgedict/train.py", line 177, in train
    val_loss, wer, pred_seqs, true_seqs = self.evaluate()
  File "/content/edgedict/train.py", line 282, in evaluate
    loss, wer, pred_seq, true_seq = self.evaluate_step(batch)
  File "/content/edgedict/train.py", line 309, in evaluate_step
    loss = self.model(xs, ys, xlen, ylen)
  File "/usr/local/lib/python3.7/dist-packages/torch/nn/modules/module.py", line 532, in __call__
    result = self.forward(*input, **kwargs)
  File "/content/edgedict/rnnt/models.py", line 232, in forward
    h_enc, _ = self.encoder(xs)
  File "/usr/local/lib/python3.7/dist-packages/torch/nn/modules/module.py", line 532, in __call__
    result = self.forward(*input, **kwargs)
  File "/content/edgedict/rnnt/models.py", line 132, in forward
    xs = self.norm(xs)
  File "/usr/local/lib/python3.7/dist-packages/torch/nn/modules/module.py", line 532, in __call__
    result = self.forward(*input, **kwargs)
  File "/usr/local/lib/python3.7/dist-packages/torch/nn/modules/normalization.py", line 153, in forward
    input, self.normalized_shape, self.weight, self.bias, self.eps)
  File "/usr/local/lib/python3.7/dist-packages/apex/amp/wrap.py", line 28, in wrapper
    return orig_fn(*new_args, **kwargs)
  File "/usr/local/lib/python3.7/dist-packages/torch/nn/functional.py", line 1696, in layer_norm
    torch.backends.cudnn.enabled)
RuntimeError: Given normalized_shape=[128], expected input with shape [*, 128], but got input of size[2, 128, 126]
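
The last line of the traceback can be reproduced in isolation, which may help narrow things down. The sketch below is not taken from edgedict: it only assumes a `torch.nn.LayerNorm(128)` and a tensor with the shape from the error message, and it guesses that the three axes are (batch, features, time). Transposing the last two axes is one possible workaround; whether that belongs in the data pipeline or in the encoder of models.py is something I cannot tell from the traceback alone.

```python
import torch
import torch.nn as nn

# Shapes copied from the error message; the axis meanings are an assumption.
norm = nn.LayerNorm(128)
xs = torch.randn(2, 128, 126)  # assumed (batch, features, time): feature axis is not last

# LayerNorm(128) normalizes over the last dimension, so that dimension must be 128.
try:
    norm(xs)
    failed = False
except RuntimeError:
    failed = True

# Hypothetical workaround: move the feature axis to the end before normalizing.
xs_fixed = xs.transpose(1, 2)  # (batch, time, features) = (2, 126, 128)
out = norm(xs_fixed)
print(failed, tuple(out.shape))
```

If the layers after the norm expect the original axis order, the output would have to be transposed back, which could explain why a local fix makes other parameters wrong.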

Rajratnpranesh, Jul 22 '21 07:07