bug: AttributeError: 'TransformerEncoder' object has no attribute 'chunk_size_left'
Hi, when I use the streaming Transformer to train the model (unit: word) and then decode the result with streaming_score.sh, I get the following error:
```
Original utterance num: 2000
Removed 0 empty utterances
  0%|          | 0/2000 [00:00<?, ?it/s]
Traceback (most recent call last):
  File "/../../../neural_sp/bin/asr/eval.py", line 251, in <module>
    main()
  File "/../../../neural_sp/bin/asr/eval.py", line 182, in main
    progressbar=True)
  File "/neural_sp/evaluators/word.py", line 73, in eval_word
    exclude_eos=True)
  File "/neural_sp/models/seq2seq/speech2text.py", line 449, in decode_streaming
    N_l = self.enc.chunk_size_left
  File "/tools/venv/lib/python3.7/site-packages/torch/nn/modules/module.py", line 535, in __getattr__
    type(self).__name__, name))
AttributeError: 'TransformerEncoder' object has no attribute 'chunk_size_left'
  0%|
```
Thanks a lot.
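For reference, the failing line is `N_l = self.enc.chunk_size_left` in `decode_streaming`; the `TransformerEncoder` apparently does not define that attribute. A defensive lookup along the lines of the sketch below (an illustration only, not a fix that exists in the repo) would turn the crash into a clearer error:

```python
# Illustration only (no such helper exists in neural_sp): guard the
# chunk-size lookup so encoders without streaming attributes raise a
# clear error instead of an AttributeError inside decode_streaming().
def get_left_chunk_size(encoder):
    n_l = getattr(encoder, 'chunk_size_left', None)
    if n_l is None:
        raise NotImplementedError(
            f"{type(encoder).__name__} does not define chunk_size_left; "
            "streaming decoding is not supported for this encoder.")
    return n_l
```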
@Gqwert123 Transformer is not supported in streaming_score.sh now. Could you try `score.sh` instead?
Thanks for your reply.
`score.sh` works fine.
But I have another question: when I use ctc/attention 0.3 decoding, some instances look fine:
Ref: ni hao a wo zai zhe li
Hyp: ni hao a wo zai zhe li
but some instances look like this:
Ref: ni hao a wo zai zhe li
Hyp: ni hao a wo zai zhe li wo you wo you wo you wo you
This instance has a bad alignment. When I use pure CTC decoding, it gets better. I use the streaming Transformer to train my model.
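For context, my understanding is that ctc/attention 0.3 decoding interpolates the two scores per hypothesis, roughly as in the generic sketch below (not neural_sp's exact code):

```python
# Generic joint CTC/attention scoring sketch (illustrative only):
# each partial hypothesis is ranked by a weighted sum of the
# attention-decoder log-probability and the CTC prefix score.
def joint_score(att_logprob, ctc_logprob, ctc_weight=0.3):
    return (1.0 - ctc_weight) * att_logprob + ctc_weight * ctc_logprob
```

Pure CTC decoding (`ctc_weight=1.0`) is frame-synchronous, so the hypothesis cannot loop over the same tokens the way the attention decoder does here.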
@Gqwert123 That is the expected behaviour of Transformer. You can avoid it by using `--length_penalty=2.0`.
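Roughly speaking, the length penalty enters the beam score as a per-token term, as in the generic sketch below (not necessarily the exact neural_sp formula):

```python
# Generic length-penalty sketch (illustrative only): a per-token term is
# added to the running log-probability; its magnitude shifts when an
# <eos>-terminated hypothesis outranks one that keeps generating tokens.
def penalized_score(logprob_sum, hyp_len, length_penalty=2.0):
    return logprob_sum + length_penalty * hyp_len
```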
Sorry, but it does not seem to have improved. I use only the AM (unit: word), without an spm model. A lot of instances turn out terrible, like:
Ref: ni hao a wo zai zhe li
Hyp: ni hao a wo zai hi li wo you wo you wo you wo you nu nu nu ou
Maybe use some forced-alignment approach? Or truncate the output so it is no longer than the input?
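As a rough illustration of the truncation idea (a hypothetical post-processing heuristic, not something neural_sp provides), trailing repeated n-grams could be stripped from a hypothesis:

```python
# Hypothetical post-processing heuristic (not part of neural_sp):
# drop a trailing run of immediately repeated n-grams such as
# "wo you wo you wo you" at the end of a hypothesis.
def strip_trailing_repeats(tokens, max_ngram=3):
    changed = True
    while changed:
        changed = False
        for n in range(max_ngram, 0, -1):
            if len(tokens) >= 2 * n and tokens[-n:] == tokens[-2 * n:-n]:
                tokens = tokens[:-n]   # remove one copy of the repeat
                changed = True
                break
    return tokens

# e.g. "ni hao a wo you wo you wo you".split() -> ['ni', 'hao', 'a', 'wo', 'you']
```

Capping the hypothesis length relative to the number of encoder frames (roughly what a `max_len_ratio`-style limit is meant to bound) would be the other direction.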
> @Gqwert123 That is the expected behaviour of Transformer. You can avoid it by using `--length_penalty=2.0`.
I use AM config:
aishell/s5/conf/asr/lc_transformer_mocha_mono4H_chunk4H_chunk16_from4L_headdrop0.5_subsample8_96_64_32.yaml
decode config:
```
model1=
model2=
model3=
model_bwd=
gpu=1
stdout=false

# path to save preprocessed data
data=${H}/data  # /n/work1/inaguma/corpus/aishell1

unit=word
metric=edit_distance
batch_size=1
beam_width=10
min_len_ratio=0.0
max_len_ratio=1.0
length_penalty=2.0
length_norm=true
coverage_penalty=0.0
coverage_threshold=0.0
gnmt_decoding=false
eos_threshold=1.0
lm=
lm_second=
lm_bwd=
lm_weight=0.3
lm_second_weight=0.3
lm_bwd_weight=0.3
ctc_weight=0.6  # 1.0 for joint CTC-attention means decoding with CTC
resolving_unk=false
fwd_bwd_attention=false
bwd_attention=false
reverse_lm_rescoring=false
```
> @Gqwert123 Transformer is not supported in streaming_score.sh now. Could you try `score.sh` instead?
Did you mean that even if I train my model with the streaming Transformer, as in the aishell recipe (aishell/s5/conf/asr/lc_transformer_mocha_mono4H_chunk4H_chunk16_from4L_headdrop0.5_subsample8_96_64_32.yaml), streaming decoding cannot be used yet?
Basically CTC decoding can be used in the offline scenario. Length penalty is the easiest way to truncate bad hypotheses during streaming decoding.
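For completeness, greedy CTC decoding collapses repeated frame-level labels and removes blanks, so the output can never be longer than the number of encoder frames; a minimal sketch:

```python
# Minimal greedy CTC decoding sketch (illustrative): take the argmax
# label per frame, collapse consecutive repeats, then drop blanks.
# The hypothesis is therefore bounded by the number of frames.
def ctc_greedy_decode(frame_label_ids, blank_id=0):
    out, prev = [], None
    for lab in frame_label_ids:
        if lab != prev and lab != blank_id:
            out.append(lab)
        prev = lab
    return out
```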
> @Gqwert123 Transformer is not supported in streaming_score.sh now. Could you try `score.sh` instead?
Hi, does the current version support Transformer streaming decoding?