RuntimeError in ctc_att_transformer_train.py
See below (using the latest master)
2021-03-29 07:34:23,835 INFO [common.py:270] ================================================================================
2021-03-29 07:34:23,837 INFO [ctc_att_transformer_train.py:440] epoch 0, learning rate 0
Traceback (most recent call last):
File "./ctc_att_transformer_train.py", line 508, in <module>
main()
File "./ctc_att_transformer_train.py", line 442, in main
objf, valid_objf, global_batch_idx_train = train_one_epoch(dataloader=train_dl,
File "./ctc_att_transformer_train.py", line 220, in train_one_epoch
curr_batch_objf, curr_batch_frames, curr_batch_all_frames = get_objf(
File "./ctc_att_transformer_train.py", line 95, in get_objf
nnet_output, encoder_memory, memory_mask = model(feature, supervisions)
File "/ceph-fj/fangjun/py38/lib/python3.8/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
result = self.forward(*input, **kwargs)
File "/root/fangjun/open-source/s2/snowfall/snowfall/models/transformer.py", line 92, in forward
encoder_memory, memory_mask = self.encode(x, supervision)
File "/root/fangjun/open-source/s2/snowfall/snowfall/models/transformer.py", line 108, in encode
x = self.encoder_embed(x)
File "/ceph-fj/fangjun/py38/lib/python3.8/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
result = self.forward(*input, **kwargs)
File "/root/fangjun/open-source/s2/snowfall/snowfall/models/transformer.py", line 384, in forward
x = self.out(x.transpose(1, 2).contiguous().view(b, t, c * f))
File "/ceph-fj/fangjun/py38/lib/python3.8/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
result = self.forward(*input, **kwargs)
File "/ceph-fj/fangjun/py38/lib/python3.8/site-packages/torch/nn/modules/linear.py", line 93, in forward
return F.linear(input, self.weight, self.bias)
File "/ceph-fj/fangjun/py38/lib/python3.8/site-packages/torch/nn/functional.py", line 1692, in linear
output = input.matmul(weight.t())
RuntimeError: mat1 dim 1 must match mat2 dim 0
In https://github.com/k2-fsa/snowfall/pull/140, the feature dimension was changed from 40 to 80. Replacing num_features=40
with num_features=80
in https://github.com/k2-fsa/snowfall/blob/master/egs/librispeech/asr/simple_v1/ctc_att_transformer_train.py#L406 should fix it.
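As a rough illustration of why the traceback ends in F.linear (this is a made-up minimal sketch, not snowfall's actual code): a Linear layer whose in_features was sized for 40-dim features fails on 80-dim input with this kind of shape mismatch, and rebuilding it with a matching in_features resolves it.

```python
import torch
import torch.nn as nn

# Input features are now 80-dimensional (as after PR #140 in this sketch).
batch = torch.randn(8, 80)

# A layer still built for the old 40-dim features: matmul shapes clash.
old_layer = nn.Linear(in_features=40, out_features=256)
try:
    old_layer(batch)
except RuntimeError as e:
    print("mismatch:", e)  # exact wording varies across torch versions

# Rebuilt with in_features matching the input dim: works.
new_layer = nn.Linear(in_features=80, out_features=256)
out = new_layer(batch)
print(out.shape)  # torch.Size([8, 256])
```

The same logic applies one level up: num_features passed to the model determines the in_features of the embedding's output projection, so it must track the fbank dimension produced by the data pipeline.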
@zhu-han thanks
I think that we should consider a refactoring of snowfall to share more code between the recipes. The number of similar issues will grow exponentially as we start adding new recipes (new corpora or scripts with new training methods).
You are good at refactoring things. Perhaps you could work on that? It doesn't have to be super carefully done; we'll have further rounds of refactoring once we settle on the algorithms we'll be using. Right now we are a bit short-handed here, and I want to focus on reducing the WER (currently, RNNLMs are the main focus).
I’m happy to do it; it’s actually been on my radar for some time now, but my time is also spread a bit thin lately. For now, let’s keep our eyes open for easy wins, and I’ll look into it more once I can.