wenet: RNN-T model produces no output at stage 5

After training the Transducer model, I ran stage 5 to decode. The output is as follows:

Namespace(attn_weight=0.5, batch_size=32, beam_size=10, blank_penalty=0.0, checkpoint='exp/baseline/avg_4.pt', config='exp/baseline/train.yaml', context_bias_mode='', context_graph_score=0.0, context_list_path='', ctc_weight=0.5, data_type='raw', decoder_scale=0.0, decoding_chunk_size=-1, gpu=0, hlg='', length_penalty=0.0, lm_scale=0.0, modes=['rnnt_beam_search'], num_decoding_left_chunks=-1, override_config=[], r_decoder_scale=0.0, result_dir='exp/baseline', reverse_weight=0.0, search_ctc_weight=0.3, search_transducer_weight=0.7, simulate_streaming=False, test_data='data/test/data.list', transducer_weight=0.5, word='')
2024-02-03 20:48:23,634 INFO use char tokenizer
2024-02-03 20:48:24,120 INFO Checkpoint: loading from checkpoint exp/baseline/avg_4.pt
{'accum_grad': 1, 'cmvn': 'global_cmvn', 'cmvn_conf': {'cmvn_file': 'data/train/global_cmvn', 'is_json_cmvn': True}, 'ctc': 'ctc', 'ctc_conf': {'ctc_blank_id': 0}, 'dataset': 'asr', 'dataset_conf': {'batch_conf': {'batch_size': 4, 'batch_type': 'static'}, 'fbank_conf': {'dither': 0.1, 'frame_length': 25, 'frame_shift': 10, 'num_mel_bins': 80}, 'filter_conf': {'max_length': 40960, 'min_length': 10, 'token_max_length': 200, 'token_min_length': 1}, 'resample_conf': {'resample_rate': 16000}, 'shuffle': True, 'shuffle_conf': {'shuffle_size': 1500}, 'sort': True, 'sort_conf': {'sort_size': 500}, 'spec_aug': True, 'spec_aug_conf': {'max_f': 10, 'max_t': 50, 'num_f_mask': 2, 'num_t_mask': 2}, 'speed_perturb': True}, 'decoder': 'bitransformer', 'decoder_conf': {'attention_heads': 4, 'dropout_rate': 0.1, 'linear_units': 2048, 'num_blocks': 3, 'positional_dropout_rate': 0.1, 'r_num_blocks': 3, 'self_attention_dropout_rate': 0.1, 'src_attention_dropout_rate': 0.1}, 'dtype': 'fp32', 'encoder': 'conformer', 'encoder_conf': {'activation_type': 'swish', 'attention_dropout_rate': 0.1, 'attention_heads': 4, 'causal': True, 'cnn_module_kernel': 8, 'cnn_module_norm': 'layer_norm', 'dropout_rate': 0.1, 'input_layer': 'conv2d', 'linear_units': 2048, 'normalize_before': True, 'num_blocks': 12, 'output_size': 256, 'pos_enc_layer_type': 'rel_pos', 'positional_dropout_rate': 0.1, 'selfattention_layer_type': 'rel_selfattn', 'use_cnn_module': True, 'use_dynamic_chunk': True, 'use_dynamic_left_chunk': False}, 'grad_clip': 4, 'input_dim': 80, 'joint': 'transducer_joint', 'joint_conf': {'activation': 'tanh', 'enc_output_size': 256, 'join_dim': 512, 'joint_mode': 'add', 'postjoin_linear': False, 'pred_output_size': 256, 'prejoin_linear': True}, 'log_interval': 100, 'max_epoch': 40, 'model': 'transducer', 'model_conf': {'attention_weight': 0.15, 'ctc_weight': 0.1, 'length_normalized_loss': False, 'lsm_weight': 0.1, 'reverse_weight': 0.3, 'transducer_weight': 0.75}, 'model_dir': 'exp/baseline', 'optim': 'adam', 'optim_conf': {'lr': 0.001}, 'output_dim': 4233, 'predictor': 'rnn', 'predictor_conf': {'bias': True, 'dropout': 0.1, 'embed_dropout': 0.1, 'embed_size': 256, 'hidden_size': 256, 'num_layers': 2, 'output_size': 256, 'rnn_type': 'lstm'}, 'save_states': 'model_only', 'scheduler': 'warmuplr', 'scheduler_conf': {'warmup_steps': 25000}, 'tokenizer': 'char', 'tokenizer_conf': {'bpe_path': None, 'is_multilingual': False, 'non_lang_syms_path': None, 'num_languages': 1, 'special_tokens': {'': 0, '': 2, '': 2, '': 1}, 'split_with_space': False, 'symbol_table_path': 'data/dict/lang_char.txt'}, 'train_engine': 'torch_ddp', 'use_amp': False, 'vocab_size': 4233, 'init_infos': {}}
2024-02-03 20:48:24,736 INFO blank_id is 0

No error is reported, but the text and wer files under the exp folder are both empty.

Feb 03 '24 12:02 DaobinZhu

I found that transducer.py has no decode() method.

Feb 03 '24 13:02 DaobinZhu

WeNet's decoding code has been refactored, and the transducer part has not been finished yet. You can refer to the previous implementation.

Feb 04 '24 01:02 Mddct

OK 👌🏻

Feb 04 '24 01:02 DaobinZhu

I adapted the previous decoding code and decoding now runs, but every utterance gets the same output:

2024-02-06 21:01:36,864 INFO blank_id is 0
2024-02-06 21:01:38,422 INFO BAC009S0764W0121 搜狐娱乐讯据台湾媒体报道
2024-02-06 21:01:39,358 INFO BAC009S0764W0122 搜狐娱乐讯据台湾媒体报道
2024-02-06 21:01:40,155 INFO BAC009S0764W0123 搜狐娱乐讯据台湾媒体报道
2024-02-06 21:01:41,196 INFO BAC009S0764W0124 搜狐娱乐讯据台湾媒体报道
2024-02-06 21:01:42,061 INFO BAC009S0764W0125 搜狐娱乐讯据台湾媒体报道
2024-02-06 21:01:42,740 INFO BAC009S0764W0126 搜狐娱乐讯据台湾媒体报道
2024-02-06 21:01:43,599 INFO BAC009S0764W0127 搜狐娱乐讯据台湾媒体报道
2024-02-06 21:01:44,328 INFO BAC009S0764W0128 搜狐娱乐讯据台湾媒体报道
2024-02-06 21:01:45,042 INFO BAC009S0764W0129 搜狐娱乐讯据台湾媒体报道
2024-02-06 21:01:46,119 INFO BAC009S0764W0130 搜狐娱乐讯据台湾媒体报道

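Debugging shows that the speech variable is identical for every utterance. The current dataset implementation seems to differ somewhat from the previous one.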
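One way to confirm this kind of symptom is to fingerprint the input of each utterance before decoding and see whether any fingerprints repeat. The sketch below is only an illustration: `fingerprint` and `find_repeated_inputs` are hypothetical helpers, and plain lists of floats stand in for the real speech tensors.

```python
import hashlib
import struct
from typing import Dict, Iterable, List, Sequence, Tuple


def fingerprint(samples: Sequence[float]) -> str:
    """Return a short hash of a sequence of float samples."""
    packed = struct.pack(f"{len(samples)}d", *samples)
    return hashlib.sha1(packed).hexdigest()[:12]


def find_repeated_inputs(
    items: Iterable[Tuple[str, Sequence[float]]]
) -> Dict[str, List[str]]:
    """Group utterance keys by input fingerprint; keep groups with more than one key."""
    groups: Dict[str, List[str]] = {}
    for key, samples in items:
        groups.setdefault(fingerprint(samples), []).append(key)
    return {fp: keys for fp, keys in groups.items() if len(keys) > 1}


if __name__ == "__main__":
    # Hypothetical data: two utterances that (wrongly) carry the same samples.
    same = [0.1, 0.2, 0.3]
    items = [("utt1", same), ("utt2", same), ("utt3", [0.4, 0.5])]
    for fp, keys in find_repeated_inputs(items).items():
        print(fp, keys)
```

If distinct utterance keys end up in the same group, the inputs reaching the decoder really are identical, which points at the data pipeline rather than the search itself.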

Feb 06 '24 13:02 DaobinZhu

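The dataset now works on dicts in place: each step adds to or updates the original dict. After the New Year holiday, once the w2vbert parameters have been ported over, I will finish and simplify RNN-T.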
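The difference between the two styles can be sketched in plain Python. This is not WeNet's code: the function names and the `key`/`wav`/`feat` entries are hypothetical, and whether aliasing is the actual cause of the repeated output above is not confirmed in this thread. The last helper shows how reusing one dict makes every stored item look the same.

```python
from typing import Any, Dict, List


def compute_feat_inplace(sample: Dict[str, Any]) -> Dict[str, Any]:
    """In-place style: add a 'feat' entry to the incoming dict and return that same dict."""
    sample["feat"] = [2 * x for x in sample["wav"]]  # stand-in for feature extraction
    return sample


def compute_feat_new(sample: Dict[str, Any]) -> Dict[str, Any]:
    """New-dict style: build a fresh dict and leave the input untouched."""
    return {"key": sample["key"], "feat": [2 * x for x in sample["wav"]]}


def collect_with_shared_dict(keys: List[str]) -> List[Dict[str, Any]]:
    """Pitfall: one dict is reused for every item, so all stored references alias it."""
    shared: Dict[str, Any] = {}
    collected = []
    for key in keys:
        shared["key"] = key       # overwrites the previous value in place
        collected.append(shared)  # appends a reference, not a snapshot
    return collected


if __name__ == "__main__":
    raw = {"key": "utt1", "wav": [1, 2, 3]}
    out = compute_feat_inplace(raw)
    print(out is raw, sorted(raw))     # True ['feat', 'key', 'wav']

    raw2 = {"key": "utt2", "wav": [1, 2, 3]}
    out2 = compute_feat_new(raw2)
    print(out2 is raw2, sorted(raw2))  # False ['key', 'wav']

    print([s["key"] for s in collect_with_shared_dict(["utt1", "utt2", "utt3"])])
    # ['utt3', 'utt3', 'utt3']
```

Code written against the new-dict style may therefore need adjusting when the pipeline switches to in-place updates, for example by copying a sample before holding on to it.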

Feb 06 '24 21:02 Mddct

Thanks, Zhou!

Feb 07 '24 01:02 DaobinZhu
