Add WFST decoding based on k2 for speechbrain
The aim of this PR is to add WFST decoding based on k2 to speechbrain. We have already discussed how to integrate k2 into speechbrain, e.g. in issue #852 and PRs #917 and #922.
The k2 WFST decoding operates on the nnet output of an acoustic model trained with speechbrain.
Everything is implemented in Python except `run.sh`, which lets us run the whole process with a single command. In this PR, I also show two topologies (CTC and HLG) for WFST decoding based on k2.
As k2-based WFST decoding algorithms keep improving, we will update this WFST decoding in speechbrain accordingly.
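Conceptually, the CTC topology (the "H" in HLG) accepts any framewise label sequence and collapses repeats and blanks. A simplified plain-Python sketch of such a topology is below; it is only an illustration of the arc structure (final-state arcs omitted), not the recipe's actual code, which would build it with k2:

```python
def ctc_topo(max_token):
    """Build a simplified CTC topology as a list of arcs.

    Each arc is (src_state, dst_state, ilabel, olabel).
    State i has a self-loop on token i with epsilon output (olabel 0),
    so repeated frames of the same token emit nothing; entering state j
    from a different state emits token j.  Token 0 is the blank.
    """
    arcs = []
    num_states = max_token + 1
    for i in range(num_states):
        for j in range(num_states):
            if i == j:
                arcs.append((i, i, i, 0))  # repeat: consume token, emit epsilon
            else:
                arcs.append((i, j, j, j))  # token change: consume and emit it
    return arcs
```

Composing this topology with a lexicon (L) and a grammar (G) yields the decoding graph used by the scripts in this PR.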
My current results with `CUDA_VISIBLE_DEVICES='0' python3 test_ctc.py`:

|        | test-clean | test-other |
|--------|------------|------------|
| WER(%) | 5.88       | 13.82      |

And with `CUDA_VISIBLE_DEVICES='0' python3 test_hlg.py`:

| lm_scale | test-clean WER(%) | test-other WER(%) |
|----------|-------------------|-------------------|
| 0.3      | 4.76              | 10.93             |
| 0.4      | 4.75              | 10.83             |
| 0.5      | 4.83              | 10.89             |
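For reference, WER figures like the ones above (and the `%WER ... [ e / n, i ins, d del, s sub ]` lines in the logs later in this thread) come from an edit-distance alignment between reference and hypothesis. A minimal pure-Python sketch of that computation (a hypothetical helper, not the actual speechbrain scorer):

```python
def wer_details(ref, hyp):
    """Levenshtein alignment of two token sequences, returning
    (errors, insertions, deletions, substitutions)."""
    R, H = len(ref), len(hyp)
    # dp[i][j] = (cost, ins, dele, sub) for ref[:i] vs hyp[:j]
    dp = [[None] * (H + 1) for _ in range(R + 1)]
    dp[0][0] = (0, 0, 0, 0)
    for i in range(1, R + 1):
        dp[i][0] = (i, 0, i, 0)  # empty hypothesis: all deletions
    for j in range(1, H + 1):
        dp[0][j] = (j, j, 0, 0)  # empty reference: all insertions
    for i in range(1, R + 1):
        for j in range(1, H + 1):
            c, ins, dele, sub = dp[i - 1][j - 1]
            if ref[i - 1] == hyp[j - 1]:
                best = (c, ins, dele, sub)            # match
            else:
                best = (c + 1, ins, dele, sub + 1)    # substitution
            c, ins, dele, sub = dp[i - 1][j]
            if c + 1 < best[0]:
                best = (c + 1, ins, dele + 1, sub)    # deletion
            c, ins, dele, sub = dp[i][j - 1]
            if c + 1 < best[0]:
                best = (c + 1, ins + 1, dele, sub)    # insertion
            dp[i][j] = best
    return dp[R][H]
```

WER is then `errors / len(ref)`, which is how the per-scale percentages in the logs are derived.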
Thank you for the updates! It looks like the performance with CTC + WFST is now better than with CTC only. I'm wondering whether we now have to close the other PRs on the WFST integration. @aheba could you please take a look at this?
I think we can close the other PRs on the WFST integration, because this PR is fully Python-based and so fits speechbrain better. We will also update the WFST integration based on this PR.
> Thank you for the updates! It looks like the performance with CTC + WFST is now better than with CTC only. I'm wondering whether we now have to close the other PRs on the WFST integration.
@luomingshuang: we started working on your PR, thank you very much for the integration.
Just to understand: why do you still call the graph HLG and not TLG?
I also saw you removed the move of the compile + decode code into our libs (speechbrain/wfst); generally we separate the recipes and the patterns.
There are some incoming commits:
- [ ] support our CategoricalLabelEncoder in addition to the sentencepiece tokenizer.
- [ ] add some documentation for the `kaldilm` + `kaldialign` deps.
- [ ] convert the shell script `run.sh` into pythonic/hyperyaml use.
@luomingshuang, I just ran the full exp, and saw:
Test-clean:
%WER 2005.99 [ 52557 / 2620, 49941 ins, 0 del, 2616 sub ]
%SER 100.00 [ 2620 / 2620 ]
Test-other:
%WER 2005.99 [ 52557 / 2620, 49941 ins, 0 del, 2616 sub ]
%SER 100.00 [ 2620 / 2620 ]
It seems that the model produces only `<eps>`.
I'm using: pytorch=1.10; k2 (torch1.10, cu10.2)
> @luomingshuang, I just ran the full exp, and it seems that the model produces only `<eps>`
> (%WER 2005.99, %SER 100.00 on both test-clean and test-other).
> I'm using: pytorch=1.10; k2 (torch1.10, cu10.2)

This is with the `test_ctc.py` script.
> Just to understand: why do you still call the graph HLG and not TLG?
Please see the comment from @danpovey in https://github.com/k2-fsa/snowfall/issues/121#issuecomment-793314492:
> I'd rather call it HLG because that's a more widely known terminology.
> @luomingshuang, I just ran the full exp, and it seems that the model produces only `<eps>`
> (%WER 2005.99, %SER 100.00 on both test-clean and test-other).
> I'm using: pytorch=1.10; k2 (torch1.10, cu10.2)
Em... OK, I see. k2 has had some new versions; I am not sure whether that is the reason. I will check this PR again with the newest k2 version.
> @luomingshuang, I just ran the full exp, and it seems that the model produces only `<eps>`
> (%WER 2005.99, %SER 100.00 on both test-clean and test-other).
> I'm using: pytorch=1.10; k2 (torch1.10, cu10.2)

> Em... OK, I see. k2 has had some new versions; I am not sure whether that is the reason. I will check this PR again with the newest k2 version.
I will switch to 1.8 to double-check from my side.
You may have to delete and regenerate any cached graphs like HLG.fst and maybe even L.fst. I suspect the issue is that at some point we changed one of the attributes on one of those graphs from linear to ragged, and the scripts were changed with it; the effect can be that epsilons get retained if the old graph is not removed. But, Mingshuang, check that the decoding script is up to date; compare with similar scripts in icefall. Adding `some_ragged_tensor = some_ragged_tensor.remove_values_leq(0)` when you are getting the transcripts would likely fix this.
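In plain-Python terms, the suggested fix amounts to dropping every id ≤ 0 (epsilon/blank) from each row of the ragged best-path labels before mapping ids to words. A hypothetical sketch, where `remove_values_leq` mimics the k2 method of the same name on a list-of-lists:

```python
def remove_values_leq(ragged, cutoff=0):
    """Drop every id <= cutoff from each row of a ragged
    (list-of-lists) structure; id 0 is <eps>/blank."""
    return [[v for v in row if v > cutoff] for row in ragged]

def ids_to_words(ragged, word_table):
    """Map surviving ids to words.  If epsilons were not removed
    first, id 0 would be looked up too and '<eps>' would flood
    the transcripts, as in the broken run above."""
    return [" ".join(word_table[i] for i in row) for row in ragged]
```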
I will update this PR soon.
> @luomingshuang, I just ran the full exp, and it seems that the model produces only `<eps>`
> (%WER 2005.99, %SER 100.00 on both test-clean and test-other).
> I'm using: pytorch=1.10; k2 (torch1.10, cu10.2)
> This is with the `test_ctc.py` script.
@aheba, I just replaced local/download_lm.py and re-ran my original PR scripts without any other changes, and the results are as shown below.
(k2-python) luomingshuang@de-74279-k2-train-4-0809194600-65b6c64f5-jh22b:/ceph-meixu/luomingshuang/speechbrain/recipes/LibriSpeech/ASR/wfst$ bash run.sh
2021-11-05 21:21:17 (run.sh:45:main) dl_dir: /ceph-meixu/luomingshuang/speechbrain/recipes/LibriSpeech/ASR/wfst/download
2021-11-05 21:21:17 (run.sh:48:main) stage 0: Download LM
2021-11-05 21:21:32,092 INFO [download_lm.py:91] out_dir: /ceph-meixu/luomingshuang/speechbrain/recipes/LibriSpeech/ASR/wfst/download/lm
Downloading LibriSpeech LM files: 0%| | 0/4 [00:00<?, ?it/s]2021-11-05 21:21:32,165 INFO [download_lm.py:71] /ceph-meixu/luomingshuang/speechbrain/recipes/LibriSpeech/ASR/wfst/download/lm/3-gram.pruned.1e-7.arpa.gz already exists - skipping
2021-11-05 21:21:32,173 INFO [download_lm.py:80] /ceph-meixu/luomingshuang/speechbrain/recipes/LibriSpeech/ASR/wfst/download/lm/3-gram.pruned.1e-7.arpa already exist - skipping
2021-11-05 21:21:32,180 INFO [download_lm.py:71] /ceph-meixu/luomingshuang/speechbrain/recipes/LibriSpeech/ASR/wfst/download/lm/4-gram.arpa.gz already exists - skipping
2021-11-05 21:21:32,187 INFO [download_lm.py:80] /ceph-meixu/luomingshuang/speechbrain/recipes/LibriSpeech/ASR/wfst/download/lm/4-gram.arpa already exist - skipping
2021-11-05 21:21:32,194 INFO [download_lm.py:71] /ceph-meixu/luomingshuang/speechbrain/recipes/LibriSpeech/ASR/wfst/download/lm/librispeech-vocab.txt already exists - skipping
2021-11-05 21:21:32,202 INFO [download_lm.py:71] /ceph-meixu/luomingshuang/speechbrain/recipes/LibriSpeech/ASR/wfst/download/lm/librispeech-lexicon.txt already exists - skipping
Downloading LibriSpeech LM files: 100%|███████████████████████████████████| 4/4 [00:00<00:00, 91.65it/s]
2021-11-05 21:21:32 (run.sh:54:main) Stage 1: Download AM
2021-11-05 21:21:48,178 INFO [download_am.py:67] out_dir: /ceph-meixu/luomingshuang/speechbrain/recipes/LibriSpeech/ASR/wfst/download/am
2021-11-05 21:21:48,185 INFO [fetching.py:107] Fetch hyperparams.yaml: Delegating to Huggingface hub, source speechbrain/asr-transformer-transformerlm-librispeech.
2021-11-05 21:21:50,580 INFO [filelock.py:274] Lock 139931527736144 acquired on /ceph-meixu/luomingshuang/.cache/huggingface/hub/b1810802561036cde8b27031d24a4fabb980a2ac136e79a9ec0d79cd9e309096.815af24e361a3166ec3e9e85966dba6f4aea895d10d8a5e3a6c156c0e3d30559.lock
Downloading: 100%|█████████████████████████████████████████████████| 4.53k/4.53k [00:00<00:00, 3.86MB/s]
2021-11-05 21:21:52,827 INFO [filelock.py:318] Lock 139931527736144 released on /ceph-meixu/luomingshuang/.cache/huggingface/hub/b1810802561036cde8b27031d24a4fabb980a2ac136e79a9ec0d79cd9e309096.815af24e361a3166ec3e9e85966dba6f4aea895d10d8a5e3a6c156c0e3d30559.lock
2021-11-05 21:22:00,031 INFO [fetching.py:107] Fetch normalizer.ckpt: Delegating to Huggingface hub, source speechbrain/asr-transformer-transformerlm-librispeech.
2021-11-05 21:22:02,315 INFO [fetching.py:107] Fetch asr.ckpt: Delegating to Huggingface hub, source speechbrain/asr-transformer-transformerlm-librispeech.
2021-11-05 21:22:04,579 INFO [fetching.py:107] Fetch lm.ckpt: Delegating to Huggingface hub, source speechbrain/asr-transformer-transformerlm-librispeech.
2021-11-05 21:22:06,834 INFO [fetching.py:107] Fetch tokenizer.ckpt: Delegating to Huggingface hub, source speechbrain/asr-transformer-transformerlm-librispeech.
2021-11-05 21:22:09,096 INFO [parameter_transfer.py:196] Loading pretrained files for: normalizer, asr, lm, tokenizer
2021-11-05 21:22:28,514 INFO [download_am.py:56] Download AM files successful!
2021-11-05 21:22:28 (run.sh:60:main) Stage 2: Prepare BPE based lang
2021-11-05 21:22:55 (run.sh:75:main) Stage 3: Prepare G
/tmp/pip-install-ptlhohcr/kaldilm_109aee31c2d342cf9e4bd109c9602d08/kaldilm/csrc/arpa_file_parser.cc:void kaldilm::ArpaFileParser::Read(std::istream&):79
[I] Reading \data\ section.
/tmp/pip-install-ptlhohcr/kaldilm_109aee31c2d342cf9e4bd109c9602d08/kaldilm/csrc/arpa_file_parser.cc:void kaldilm::ArpaFileParser::Read(std::istream&):140
[I] Reading \1-grams: section.
/tmp/pip-install-ptlhohcr/kaldilm_109aee31c2d342cf9e4bd109c9602d08/kaldilm/csrc/arpa_file_parser.cc:void kaldilm::ArpaFileParser::Read(std::istream&):140
[I] Reading \2-grams: section.
/tmp/pip-install-ptlhohcr/kaldilm_109aee31c2d342cf9e4bd109c9602d08/kaldilm/csrc/arpa_file_parser.cc:void kaldilm::ArpaFileParser::Read(std::istream&):140
[I] Reading \3-grams: section.
/tmp/pip-install-ptlhohcr/kaldilm_109aee31c2d342cf9e4bd109c9602d08/kaldilm/csrc/arpa_file_parser.cc:void kaldilm::ArpaFileParser::Read(std::istream&):79
[I] Reading \data\ section.
/tmp/pip-install-ptlhohcr/kaldilm_109aee31c2d342cf9e4bd109c9602d08/kaldilm/csrc/arpa_file_parser.cc:void kaldilm::ArpaFileParser::Read(std::istream&):140
[I] Reading \1-grams: section.
/tmp/pip-install-ptlhohcr/kaldilm_109aee31c2d342cf9e4bd109c9602d08/kaldilm/csrc/arpa_file_parser.cc:void kaldilm::ArpaFileParser::Read(std::istream&):140
[I] Reading \2-grams: section.
/tmp/pip-install-ptlhohcr/kaldilm_109aee31c2d342cf9e4bd109c9602d08/kaldilm/csrc/arpa_file_parser.cc:void kaldilm::ArpaFileParser::Read(std::istream&):140
[I] Reading \3-grams: section.
/tmp/pip-install-ptlhohcr/kaldilm_109aee31c2d342cf9e4bd109c9602d08/kaldilm/csrc/arpa_file_parser.cc:void kaldilm::ArpaFileParser::Read(std::istream&):140
[I] Reading \4-grams: section.
2021-11-05 21:37:27 (run.sh:100:main) Stage 4: Compile HLG
2021-11-05 21:37:51,915 INFO [compile_hlg.py:142] Processing data/lang_bpe
2021-11-05 21:37:52,313 INFO [lexicon.py:116] Converting L.pt to Linv.pt
2021-11-05 21:37:53,799 INFO [compile_hlg.py:64] Building ctc_topo. max_token_id: 4999
2021-11-05 21:37:54,733 INFO [compile_hlg.py:73] Loading G_3_gram.fst.txt
2021-11-05 21:38:09,374 INFO [compile_hlg.py:84] Intersecting L and G
2021-11-05 21:38:21,602 INFO [compile_hlg.py:86] LG shape: (6485497, None)
2021-11-05 21:38:21,602 INFO [compile_hlg.py:88] Connecting LG
2021-11-05 21:38:21,602 INFO [compile_hlg.py:90] LG shape after k2.connect: (6485497, None)
2021-11-05 21:38:21,602 INFO [compile_hlg.py:92] <class 'torch.Tensor'>
2021-11-05 21:38:21,602 INFO [compile_hlg.py:93] Determinizing LG
2021-11-05 21:38:44,318 INFO [compile_hlg.py:96] <class '_k2.ragged.RaggedTensor'>
2021-11-05 21:38:44,318 INFO [compile_hlg.py:98] Connecting LG after k2.determinize
2021-11-05 21:38:44,318 INFO [compile_hlg.py:101] Removing disambiguation symbols on LG
2021-11-05 21:39:51,793 INFO [compile_hlg.py:109] LG shape after k2.remove_epsilon: (4216099, None)
2021-11-05 21:39:58,426 INFO [compile_hlg.py:114] Arc sorting LG
2021-11-05 21:39:58,426 INFO [compile_hlg.py:117] Composing H and LG
2021-11-05 21:41:44,032 INFO [compile_hlg.py:124] Connecting LG
2021-11-05 21:41:44,033 INFO [compile_hlg.py:127] Arc sorting LG
2021-11-05 21:42:07,384 INFO [compile_hlg.py:129] HLG.shape: (3944008, None)
2021-11-05 21:42:07,522 INFO [compile_hlg.py:145] Saving HLG.pt to data/lang_bpe
2021-11-05 21:43:25 (run.sh:107:main) Stage 5: Decoding based on k2
2021-11-05 21:44:05,858 INFO [test_hlg.py:383] Decoding started
2021-11-05 21:44:05,858 INFO [test_hlg.py:384] {'exp_dir': PosixPath('results'), 'lang_dir': PosixPath('data/lang_bpe'), 'lm_dir': PosixPath('data/lm'), 'search_beam': 20, 'output_beam': 5, 'min_active_states': 30, 'max_active_states': 10000, 'use_double_scores': True, 'epoch': 19, 'avg': 5, 'method': 'whole-lattice-rescoring', 'num_paths': 100, 'lattice_score_scale': 0.5, 'export': False}
2021-11-05 21:44:06,260 INFO [lexicon.py:113] Loading pre-compiled data/lang_bpe/Linv.pt
2021-11-05 21:44:06,396 INFO [test_hlg.py:393] device: cuda:0
2021-11-05 21:44:27,310 INFO [test_hlg.py:406] Loading G_4_gram.fst.txt
2021-11-05 21:44:27,310 WARNING [test_hlg.py:407] It may take 8 minutes.
2021-11-05 22:15:50,699 INFO [fetching.py:80] Fetch hyperparams.yaml: Using existing file/symlink in download/am/hyperparams.yaml.
2021-11-05 22:15:59,488 INFO [fetching.py:80] Fetch normalizer.ckpt: Using existing file/symlink in download/am/normalizer.ckpt.
2021-11-05 22:15:59,488 INFO [fetching.py:80] Fetch asr.ckpt: Using existing file/symlink in download/am/asr.ckpt.
2021-11-05 22:15:59,489 INFO [fetching.py:80] Fetch lm.ckpt: Using existing file/symlink in download/am/lm.ckpt.
2021-11-05 22:15:59,489 INFO [fetching.py:80] Fetch tokenizer.ckpt: Using existing file/symlink in download/am/tokenizer.ckpt.
2021-11-05 22:15:59,490 INFO [parameter_transfer.py:196] Loading pretrained files for: normalizer, asr, lm, tokenizer
100%|███████████████████████████████████████████████████████████████| 2620/2620 [14:58<00:00, 2.92it/s]
2021-11-05 22:31:03,065 INFO [test_hlg.py:341] The transcripts are stored in results/recogs-test-clean-lm_scale_0.1.txt
2021-11-05 22:31:03,134 INFO [utils.py:449] [test-clean-lm_scale_0.1] %WER 4.97% [2612 / 52576, 275 ins, 514 del, 1823 sub ]
2021-11-05 22:31:03,310 INFO [test_hlg.py:350] Wrote detailed error stats to results/errs-test-clean-lm_scale_0.1.txt
2021-11-05 22:31:03,330 INFO [test_hlg.py:341] The transcripts are stored in results/recogs-test-clean-lm_scale_0.2.txt
2021-11-05 22:31:03,395 INFO [utils.py:449] [test-clean-lm_scale_0.2] %WER 4.83% [2542 / 52576, 261 ins, 526 del, 1755 sub ]
2021-11-05 22:31:03,572 INFO [test_hlg.py:350] Wrote detailed error stats to results/errs-test-clean-lm_scale_0.2.txt
2021-11-05 22:31:03,592 INFO [test_hlg.py:341] The transcripts are stored in results/recogs-test-clean-lm_scale_0.3.txt
2021-11-05 22:31:03,656 INFO [utils.py:449] [test-clean-lm_scale_0.3] %WER 4.76% [2505 / 52576, 242 ins, 558 del, 1705 sub ]
2021-11-05 22:31:03,829 INFO [test_hlg.py:350] Wrote detailed error stats to results/errs-test-clean-lm_scale_0.3.txt
2021-11-05 22:31:03,849 INFO [test_hlg.py:341] The transcripts are stored in results/recogs-test-clean-lm_scale_0.4.txt
2021-11-05 22:31:03,914 INFO [utils.py:449] [test-clean-lm_scale_0.4] %WER 4.75% [2496 / 52576, 228 ins, 612 del, 1656 sub ]
2021-11-05 22:31:04,157 INFO [test_hlg.py:350] Wrote detailed error stats to results/errs-test-clean-lm_scale_0.4.txt
2021-11-05 22:31:04,177 INFO [test_hlg.py:341] The transcripts are stored in results/recogs-test-clean-lm_scale_0.5.txt
2021-11-05 22:31:04,243 INFO [utils.py:449] [test-clean-lm_scale_0.5] %WER 4.83% [2537 / 52576, 220 ins, 678 del, 1639 sub ]
2021-11-05 22:31:04,418 INFO [test_hlg.py:350] Wrote detailed error stats to results/errs-test-clean-lm_scale_0.5.txt
2021-11-05 22:31:04,439 INFO [test_hlg.py:341] The transcripts are stored in results/recogs-test-clean-lm_scale_0.6.txt
2021-11-05 22:31:04,503 INFO [utils.py:449] [test-clean-lm_scale_0.6] %WER 4.95% [2605 / 52576, 200 ins, 792 del, 1613 sub ]
2021-11-05 22:31:04,681 INFO [test_hlg.py:350] Wrote detailed error stats to results/errs-test-clean-lm_scale_0.6.txt
2021-11-05 22:31:04,701 INFO [test_hlg.py:341] The transcripts are stored in results/recogs-test-clean-lm_scale_0.7.txt
2021-11-05 22:31:04,765 INFO [utils.py:449] [test-clean-lm_scale_0.7] %WER 5.15% [2709 / 52576, 187 ins, 940 del, 1582 sub ]
2021-11-05 22:31:04,938 INFO [test_hlg.py:350] Wrote detailed error stats to results/errs-test-clean-lm_scale_0.7.txt
2021-11-05 22:31:04,958 INFO [test_hlg.py:341] The transcripts are stored in results/recogs-test-clean-lm_scale_0.8.txt
2021-11-05 22:31:05,023 INFO [utils.py:449] [test-clean-lm_scale_0.8] %WER 5.52% [2904 / 52576, 168 ins, 1168 del, 1568 sub ]
2021-11-05 22:31:05,201 INFO [test_hlg.py:350] Wrote detailed error stats to results/errs-test-clean-lm_scale_0.8.txt
2021-11-05 22:31:05,221 INFO [test_hlg.py:341] The transcripts are stored in results/recogs-test-clean-lm_scale_0.9.txt
2021-11-05 22:31:05,286 INFO [utils.py:449] [test-clean-lm_scale_0.9] %WER 6.03% [3172 / 52576, 148 ins, 1452 del, 1572 sub ]
2021-11-05 22:31:05,459 INFO [test_hlg.py:350] Wrote detailed error stats to results/errs-test-clean-lm_scale_0.9.txt
2021-11-05 22:31:05,479 INFO [test_hlg.py:341] The transcripts are stored in results/recogs-test-clean-lm_scale_1.0.txt
2021-11-05 22:31:05,543 INFO [utils.py:449] [test-clean-lm_scale_1.0] %WER 6.64% [3489 / 52576, 139 ins, 1756 del, 1594 sub ]
2021-11-05 22:31:05,779 INFO [test_hlg.py:350] Wrote detailed error stats to results/errs-test-clean-lm_scale_1.0.txt
2021-11-05 22:31:05,800 INFO [test_hlg.py:341] The transcripts are stored in results/recogs-test-clean-lm_scale_1.1.txt
2021-11-05 22:31:05,865 INFO [utils.py:449] [test-clean-lm_scale_1.1] %WER 7.35% [3865 / 52576, 130 ins, 2131 del, 1604 sub ]
2021-11-05 22:31:06,041 INFO [test_hlg.py:350] Wrote detailed error stats to results/errs-test-clean-lm_scale_1.1.txt
2021-11-05 22:31:06,060 INFO [test_hlg.py:341] The transcripts are stored in results/recogs-test-clean-lm_scale_1.2.txt
2021-11-05 22:31:06,128 INFO [utils.py:449] [test-clean-lm_scale_1.2] %WER 8.08% [4250 / 52576, 119 ins, 2513 del, 1618 sub ]
2021-11-05 22:31:06,302 INFO [test_hlg.py:350] Wrote detailed error stats to results/errs-test-clean-lm_scale_1.2.txt
2021-11-05 22:31:06,321 INFO [test_hlg.py:341] The transcripts are stored in results/recogs-test-clean-lm_scale_1.3.txt
2021-11-05 22:31:06,387 INFO [utils.py:449] [test-clean-lm_scale_1.3] %WER 8.77% [4611 / 52576, 115 ins, 2882 del, 1614 sub ]
2021-11-05 22:31:06,565 INFO [test_hlg.py:350] Wrote detailed error stats to results/errs-test-clean-lm_scale_1.3.txt
2021-11-05 22:31:06,585 INFO [test_hlg.py:341] The transcripts are stored in results/recogs-test-clean-lm_scale_1.4.txt
2021-11-05 22:31:06,649 INFO [utils.py:449] [test-clean-lm_scale_1.4] %WER 9.47% [4977 / 52576, 110 ins, 3218 del, 1649 sub ]
2021-11-05 22:31:06,827 INFO [test_hlg.py:350] Wrote detailed error stats to results/errs-test-clean-lm_scale_1.4.txt
2021-11-05 22:31:06,848 INFO [test_hlg.py:341] The transcripts are stored in results/recogs-test-clean-lm_scale_1.5.txt
2021-11-05 22:31:06,913 INFO [utils.py:449] [test-clean-lm_scale_1.5] %WER 10.21% [5367 / 52576, 106 ins, 3595 del, 1666 sub ]
2021-11-05 22:31:07,092 INFO [test_hlg.py:350] Wrote detailed error stats to results/errs-test-clean-lm_scale_1.5.txt
2021-11-05 22:31:07,112 INFO [test_hlg.py:341] The transcripts are stored in results/recogs-test-clean-lm_scale_1.6.txt
2021-11-05 22:31:07,238 INFO [utils.py:449] [test-clean-lm_scale_1.6] %WER 10.86% [5710 / 52576, 99 ins, 3918 del, 1693 sub ]
2021-11-05 22:31:07,419 INFO [test_hlg.py:350] Wrote detailed error stats to results/errs-test-clean-lm_scale_1.6.txt
2021-11-05 22:31:07,439 INFO [test_hlg.py:341] The transcripts are stored in results/recogs-test-clean-lm_scale_1.7.txt
2021-11-05 22:31:07,510 INFO [utils.py:449] [test-clean-lm_scale_1.7] %WER 11.42% [6005 / 52576, 98 ins, 4194 del, 1713 sub ]
2021-11-05 22:31:07,687 INFO [test_hlg.py:350] Wrote detailed error stats to results/errs-test-clean-lm_scale_1.7.txt
2021-11-05 22:31:07,707 INFO [test_hlg.py:341] The transcripts are stored in results/recogs-test-clean-lm_scale_1.8.txt
2021-11-05 22:31:07,772 INFO [utils.py:449] [test-clean-lm_scale_1.8] %WER 11.97% [6292 / 52576, 93 ins, 4474 del, 1725 sub ]
2021-11-05 22:31:07,949 INFO [test_hlg.py:350] Wrote detailed error stats to results/errs-test-clean-lm_scale_1.8.txt
2021-11-05 22:31:07,969 INFO [test_hlg.py:341] The transcripts are stored in results/recogs-test-clean-lm_scale_1.9.txt
2021-11-05 22:31:08,037 INFO [utils.py:449] [test-clean-lm_scale_1.9] %WER 12.47% [6555 / 52576, 91 ins, 4724 del, 1740 sub ]
2021-11-05 22:31:08,216 INFO [test_hlg.py:350] Wrote detailed error stats to results/errs-test-clean-lm_scale_1.9.txt
2021-11-05 22:31:08,235 INFO [test_hlg.py:341] The transcripts are stored in results/recogs-test-clean-lm_scale_2.0.txt
2021-11-05 22:31:08,300 INFO [utils.py:449] [test-clean-lm_scale_2.0] %WER 12.87% [6767 / 52576, 86 ins, 4922 del, 1759 sub ]
2021-11-05 22:31:08,497 INFO [test_hlg.py:350] Wrote detailed error stats to results/errs-test-clean-lm_scale_2.0.txt
2021-11-05 22:31:08,498 INFO [test_hlg.py:364]
For test-clean, WER of different settings are:
lm_scale_0.4 4.75 best for test-clean
lm_scale_0.3 4.76
lm_scale_0.2 4.83
lm_scale_0.5 4.83
lm_scale_0.6 4.95
lm_scale_0.1 4.97
lm_scale_0.7 5.15
lm_scale_0.8 5.52
lm_scale_0.9 6.03
lm_scale_1.0 6.64
lm_scale_1.1 7.35
lm_scale_1.2 8.08
lm_scale_1.3 8.77
lm_scale_1.4 9.47
lm_scale_1.5 10.21
lm_scale_1.6 10.86
lm_scale_1.7 11.42
lm_scale_1.8 11.97
lm_scale_1.9 12.47
lm_scale_2.0 12.87
100%|███████████████████████████████████████████████████████████████| 2939/2939 [12:58<00:00, 3.77it/s]
2021-11-05 22:44:07,117 INFO [test_hlg.py:341] The transcripts are stored in results/recogs-test-other-lm_scale_0.1.txt
2021-11-05 22:44:07,192 INFO [utils.py:449] [test-other-lm_scale_0.1] %WER 11.57% [6058 / 52343, 536 ins, 1502 del, 4020 sub ]
2021-11-05 22:44:07,380 INFO [test_hlg.py:350] Wrote detailed error stats to results/errs-test-other-lm_scale_0.1.txt
2021-11-05 22:44:07,401 INFO [test_hlg.py:341] The transcripts are stored in results/recogs-test-other-lm_scale_0.2.txt
2021-11-05 22:44:07,470 INFO [utils.py:449] [test-other-lm_scale_0.2] %WER 11.24% [5884 / 52343, 486 ins, 1544 del, 3854 sub ]
2021-11-05 22:44:07,724 INFO [test_hlg.py:350] Wrote detailed error stats to results/errs-test-other-lm_scale_0.2.txt
2021-11-05 22:44:07,745 INFO [test_hlg.py:341] The transcripts are stored in results/recogs-test-other-lm_scale_0.3.txt
2021-11-05 22:44:07,815 INFO [utils.py:449] [test-other-lm_scale_0.3] %WER 10.93% [5722 / 52343, 432 ins, 1638 del, 3652 sub ]
2021-11-05 22:44:08,001 INFO [test_hlg.py:350] Wrote detailed error stats to results/errs-test-other-lm_scale_0.3.txt
2021-11-05 22:44:08,022 INFO [test_hlg.py:341] The transcripts are stored in results/recogs-test-other-lm_scale_0.4.txt
2021-11-05 22:44:08,089 INFO [utils.py:449] [test-other-lm_scale_0.4] %WER 10.83% [5670 / 52343, 384 ins, 1758 del, 3528 sub ]
2021-11-05 22:44:08,275 INFO [test_hlg.py:350] Wrote detailed error stats to results/errs-test-other-lm_scale_0.4.txt
2021-11-05 22:44:08,297 INFO [test_hlg.py:341] The transcripts are stored in results/recogs-test-other-lm_scale_0.5.txt
2021-11-05 22:44:08,366 INFO [utils.py:449] [test-other-lm_scale_0.5] %WER 10.89% [5700 / 52343, 337 ins, 1964 del, 3399 sub ]
2021-11-05 22:44:08,554 INFO [test_hlg.py:350] Wrote detailed error stats to results/errs-test-other-lm_scale_0.5.txt
2021-11-05 22:44:08,576 INFO [test_hlg.py:341] The transcripts are stored in results/recogs-test-other-lm_scale_0.6.txt
2021-11-05 22:44:08,645 INFO [utils.py:449] [test-other-lm_scale_0.6] %WER 11.12% [5820 / 52343, 298 ins, 2262 del, 3260 sub ]
2021-11-05 22:44:08,830 INFO [test_hlg.py:350] Wrote detailed error stats to results/errs-test-other-lm_scale_0.6.txt
2021-11-05 22:44:08,851 INFO [test_hlg.py:341] The transcripts are stored in results/recogs-test-other-lm_scale_0.7.txt
2021-11-05 22:44:08,920 INFO [utils.py:449] [test-other-lm_scale_0.7] %WER 11.58% [6063 / 52343, 267 ins, 2650 del, 3146 sub ]
2021-11-05 22:44:09,171 INFO [test_hlg.py:350] Wrote detailed error stats to results/errs-test-other-lm_scale_0.7.txt
2021-11-05 22:44:09,191 INFO [test_hlg.py:341] The transcripts are stored in results/recogs-test-other-lm_scale_0.8.txt
2021-11-05 22:44:09,262 INFO [utils.py:449] [test-other-lm_scale_0.8] %WER 12.16% [6365 / 52343, 242 ins, 3063 del, 3060 sub ]
2021-11-05 22:44:09,448 INFO [test_hlg.py:350] Wrote detailed error stats to results/errs-test-other-lm_scale_0.8.txt
2021-11-05 22:44:09,468 INFO [test_hlg.py:341] The transcripts are stored in results/recogs-test-other-lm_scale_0.9.txt
2021-11-05 22:44:09,536 INFO [utils.py:449] [test-other-lm_scale_0.9] %WER 12.94% [6771 / 52343, 212 ins, 3574 del, 2985 sub ]
2021-11-05 22:44:09,741 INFO [test_hlg.py:350] Wrote detailed error stats to results/errs-test-other-lm_scale_0.9.txt
2021-11-05 22:44:09,763 INFO [test_hlg.py:341] The transcripts are stored in results/recogs-test-other-lm_scale_1.0.txt
2021-11-05 22:44:09,834 INFO [utils.py:449] [test-other-lm_scale_1.0] %WER 13.88% [7267 / 52343, 201 ins, 4163 del, 2903 sub ]
2021-11-05 22:44:10,018 INFO [test_hlg.py:350] Wrote detailed error stats to results/errs-test-other-lm_scale_1.0.txt
2021-11-05 22:44:10,039 INFO [test_hlg.py:341] The transcripts are stored in results/recogs-test-other-lm_scale_1.1.txt
2021-11-05 22:44:10,109 INFO [utils.py:449] [test-other-lm_scale_1.1] %WER 14.96% [7829 / 52343, 181 ins, 4800 del, 2848 sub ]
2021-11-05 22:44:10,291 INFO [test_hlg.py:350] Wrote detailed error stats to results/errs-test-other-lm_scale_1.1.txt
2021-11-05 22:44:10,311 INFO [test_hlg.py:341] The transcripts are stored in results/recogs-test-other-lm_scale_1.2.txt
2021-11-05 22:44:10,380 INFO [utils.py:449] [test-other-lm_scale_1.2] %WER 16.02% [8387 / 52343, 164 ins, 5460 del, 2763 sub ]
2021-11-05 22:44:10,585 INFO [test_hlg.py:350] Wrote detailed error stats to results/errs-test-other-lm_scale_1.2.txt
2021-11-05 22:44:10,605 INFO [test_hlg.py:341] The transcripts are stored in results/recogs-test-other-lm_scale_1.3.txt
2021-11-05 22:44:10,672 INFO [utils.py:449] [test-other-lm_scale_1.3] %WER 17.19% [9000 / 52343, 147 ins, 6150 del, 2703 sub ]
2021-11-05 22:44:10,920 INFO [test_hlg.py:350] Wrote detailed error stats to results/errs-test-other-lm_scale_1.3.txt
2021-11-05 22:44:10,940 INFO [test_hlg.py:341] The transcripts are stored in results/recogs-test-other-lm_scale_1.4.txt
2021-11-05 22:44:11,010 INFO [utils.py:449] [test-other-lm_scale_1.4] %WER 18.22% [9537 / 52343, 136 ins, 6726 del, 2675 sub ]
2021-11-05 22:44:11,194 INFO [test_hlg.py:350] Wrote detailed error stats to results/errs-test-other-lm_scale_1.4.txt
2021-11-05 22:44:11,214 INFO [test_hlg.py:341] The transcripts are stored in results/recogs-test-other-lm_scale_1.5.txt
2021-11-05 22:44:11,281 INFO [utils.py:449] [test-other-lm_scale_1.5] %WER 19.25% [10078 / 52343, 125 ins, 7293 del, 2660 sub ]
2021-11-05 22:44:11,481 INFO [test_hlg.py:350] Wrote detailed error stats to results/errs-test-other-lm_scale_1.5.txt
2021-11-05 22:44:11,502 INFO [test_hlg.py:341] The transcripts are stored in results/recogs-test-other-lm_scale_1.6.txt
2021-11-05 22:44:11,569 INFO [utils.py:449] [test-other-lm_scale_1.6] %WER 20.24% [10595 / 52343, 116 ins, 7818 del, 2661 sub ]
2021-11-05 22:44:11,752 INFO [test_hlg.py:350] Wrote detailed error stats to results/errs-test-other-lm_scale_1.6.txt
2021-11-05 22:44:11,772 INFO [test_hlg.py:341] The transcripts are stored in results/recogs-test-other-lm_scale_1.7.txt
2021-11-05 22:44:11,839 INFO [utils.py:449] [test-other-lm_scale_1.7] %WER 21.07% [11028 / 52343, 111 ins, 8238 del, 2679 sub ]
2021-11-05 22:44:12,022 INFO [test_hlg.py:350] Wrote detailed error stats to results/errs-test-other-lm_scale_1.7.txt
2021-11-05 22:44:12,042 INFO [test_hlg.py:341] The transcripts are stored in results/recogs-test-other-lm_scale_1.8.txt
2021-11-05 22:44:12,116 INFO [utils.py:449] [test-other-lm_scale_1.8] %WER 21.78% [11400 / 52343, 111 ins, 8626 del, 2663 sub ]
2021-11-05 22:44:12,363 INFO [test_hlg.py:350] Wrote detailed error stats to results/errs-test-other-lm_scale_1.8.txt
2021-11-05 22:44:12,383 INFO [test_hlg.py:341] The transcripts are stored in results/recogs-test-other-lm_scale_1.9.txt
2021-11-05 22:44:12,452 INFO [utils.py:449] [test-other-lm_scale_1.9] %WER 22.43% [11739 / 52343, 109 ins, 8967 del, 2663 sub ]
2021-11-05 22:44:12,642 INFO [test_hlg.py:350] Wrote detailed error stats to results/errs-test-other-lm_scale_1.9.txt
2021-11-05 22:44:12,663 INFO [test_hlg.py:341] The transcripts are stored in results/recogs-test-other-lm_scale_2.0.txt
2021-11-05 22:44:12,730 INFO [utils.py:449] [test-other-lm_scale_2.0] %WER 23.06% [12070 / 52343, 104 ins, 9319 del, 2647 sub ]
2021-11-05 22:44:12,921 INFO [test_hlg.py:350] Wrote detailed error stats to results/errs-test-other-lm_scale_2.0.txt
2021-11-05 22:44:12,922 INFO [test_hlg.py:364]
For test-other, WER of different settings are:
lm_scale_0.4 10.83 best for test-other
lm_scale_0.5 10.89
lm_scale_0.3 10.93
lm_scale_0.6 11.12
lm_scale_0.2 11.24
lm_scale_0.1 11.57
lm_scale_0.7 11.58
lm_scale_0.8 12.16
lm_scale_0.9 12.94
lm_scale_1.0 13.88
lm_scale_1.1 14.96
lm_scale_1.2 16.02
lm_scale_1.3 17.19
lm_scale_1.4 18.22
lm_scale_1.5 19.25
lm_scale_1.6 20.24
lm_scale_1.7 21.07
lm_scale_1.8 21.78
lm_scale_1.9 22.43
lm_scale_2.0 23.06
2021-11-05 22:44:12,922 INFO [test_hlg.py:531] Done!
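The lm_scale sweep above re-ranks the same lattice with `total = am_score + lm_scale * lm_score` and reports the WER per scale. A toy n-best version of that selection (the sentences and scores below are made up for illustration, not taken from the recipe):

```python
def best_hyp(hyps, lm_scale):
    """Pick the hypothesis maximizing am + lm_scale * lm.
    Each entry is (text, am_score, lm_score); higher is better."""
    return max(hyps, key=lambda h: h[1] + lm_scale * h[2])[0]

hyps = [
    ("i scream", -4.0, -9.0),   # good acoustics, unlikely word sequence
    ("ice cream", -4.5, -4.0),  # slightly worse acoustics, better LM score
]
```

With `lm_scale=0` the acoustically best hypothesis wins; increasing the scale lets the LM override it, which is why too large a scale causes the growing deletion counts seen in the sweep.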
recogs-test-clean-lm_scale_0.4.txt recogs-test-other-lm_scale_0.4.txt
My environment:
(k2-python) luomingshuang@de-74279-k2-train-5-0816110343-9647676d5-sqn62:/ceph-meixu/luomingshuang/speechbrain/recipes/LibriSpeech/ASR/wfst/results$ python3 -m k2.version
Collecting environment information...
k2 version: 1.8
Build type: Release
Git SHA1: 210175c08ba8ca4b0e172a59a4f6fb4c677b176c
Git date: Tue Sep 14 08:51:29 2021
Cuda used to build k2: 10.2
cuDNN used to build k2: 8.0.2
Python version used to build k2: 3.8
OS used to build k2: Ubuntu 16.04.7 LTS
CMake version: 3.18.4
GCC version: 5.5.0
CMAKE_CUDA_FLAGS: --expt-extended-lambda -gencode arch=compute_35,code=sm_35 --expt-extended-lambda -gencode arch=compute_50,code=sm_50 --expt-extended-lambda -gencode arch=compute_60,code=sm_60 --expt-extended-lambda -gencode arch=compute_61,code=sm_61 --expt-extended-lambda -gencode arch=compute_70,code=sm_70 --expt-extended-lambda -gencode arch=compute_75,code=sm_75 -D_GLIBCXX_USE_CXX11_ABI=0 --compiler-options -Wall --compiler-options -Wno-unknown-pragmas --compiler-options -Wno-strict-overflow
CMAKE_CXX_FLAGS: -D_GLIBCXX_USE_CXX11_ABI=0 -Wno-strict-overflow
PyTorch version used to build k2: 1.8.1
PyTorch is using Cuda: 10.2
NVTX enabled: True
With CUDA: True
Disable debug: True
Sync kernels : False
Disable checks: False
If you are using k2 v1.10 to run these scripts, I will re-run with k2 v1.10.
thanks @luomingshuang ,
I'm able to reproduce the same results with torch 1.8. After adding our categorical label encoding, I will move to 1.10.
Let me highlight a small problem where I think we need to adapt to k2's conventions.
In SpeechBrain we have the label encoder class, which helps map ids to tokens and handles other target management.
here is an example:
'q' => 34
'u' => 1
'e' => 2
' ' => 3
'n' => 4
'o' => 5
'v' => 6
'l' => 7
's' => 8
'c' => 9
'r' => 10
'd' => 11
'é' => 12
'b' => 13
'a' => 14
'i' => 15
"'" => 16
't' => 17
'm' => 18
'ç' => 19
'g' => 20
'p' => 21
'j' => 22
'y' => 23
'-' => 24
'x' => 25
'z' => 26
'f' => 27
'k' => 28
'h' => 29
'à' => 30
'è' => 31
'ê' => 32
'w' => 33
'<blank>' => 0
================
'starting_index' => 0
'blank_label' => '<blank>'
If you see, token 3 (which is the space) is labeled ' ' instead of '<space>' (or '▁' in SentencePiece)...
This leads to an error when using k2, specifically in tokens.txt, where id=3 is mapped to ' ':
q 34
u 1
e 2
3
n 4
o 5
v 6
l 7
s 8
c 9
....
ERROR:
Traceback (most recent call last):
File "graph_based_generator.py", line 159, in <module>
main()
File "graph_based_generator.py", line 154, in main
HLG = compile_HLG(inlang, inG, outgraph)
File "/home/aheba/ASR_LVCSR_E2E/K2/speechbrain/recipes/LibriSpeech/ASR/wfst/Graph-BASED/graph_based_gen/utils/make_TLG.py", line 22, in compile_HLG
lexicon = Lexicon(lang_dir)
File "/home/aheba/ASR_LVCSR_E2E/K2/speechbrain/recipes/LibriSpeech/ASR/wfst/Graph-BASED/graph_based_gen/utils/lexicon.py", line 134, in __init__
self.token_table = k2.SymbolTable.from_file(lang_dir +"/tokens.txt")
File "/home/aheba/anaconda3/envs/k2-decode-spbrain/lib/python3.8/site-packages/k2/symbol_table.py", line 132, in from_file
return SymbolTable.from_str(f.read().strip())
File "/home/aheba/anaconda3/envs/k2-decode-spbrain/lib/python3.8/site-packages/k2/symbol_table.py", line 97, in from_str
assert len(fields) == 2, \
AssertionError: Expect a line with 2 fields. Given: 1
The problem here is the line for id 3, which keeps only the ID field because the space symbol is stripped here: https://github.com/k2-fsa/k2/blob/e8c589a47e8acb19c9c08df5ceaba7d19d78238e/k2/python/k2/symbol_table.py#L93
In that case, @danpovey, should we add a '<SPACE>' word to words.txt?
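If we do rename the space symbol, writing a k2-compatible tokens.txt from a SpeechBrain-style label-to-id mapping could be sketched as follows. This is a hypothetical helper, not SpeechBrain or k2 API; `write_tokens_txt` and the toy mapping are made up for illustration:

```python
# Hypothetical sketch: dump a SpeechBrain-style label->id mapping as a
# k2-style tokens.txt, replacing the bare space label with '<space>' so
# that every line keeps two whitespace-separated fields.
def write_tokens_txt(lab2ind, path):
    """lab2ind: dict mapping token string -> integer id."""
    with open(path, "w", encoding="utf-8") as f:
        for label, idx in sorted(lab2ind.items(), key=lambda kv: kv[1]):
            if label == " ":  # a bare space would yield a one-field line
                label = "<space>"
            f.write(f"{label} {idx}\n")

# Toy mapping mimicking the label encoder output shown above.
lab2ind = {"<blank>": 0, "u": 1, "e": 2, " ": 3, "n": 4}
write_tokens_txt(lab2ind, "tokens.txt")
```

With this, k2's `SymbolTable.from_file` no longer sees a one-field line for id 3.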
An additional question:
Do you have a function for validating the lang dir?
In fact, in k2 and icefall, adding a <SPACE> unit is not necessary, I think, because the <eps> unit can play this role.
As for functions to validate the lang dir, is this file (https://github.com/k2-fsa/icefall/blob/master/egs/librispeech/ASR/local/test_prepare_lang.py) what you need?
Generally, only lexicon.txt is needed. We use lexicon.txt to generate tokens.txt, words.txt, L.pt and L_disambig.pt (see https://github.com/k2-fsa/icefall/blob/master/egs/librispeech/ASR/local/prepare_lang.py). Then we use words.txt and the LM files (xxx.arpa) to generate G_3_gram.fst.txt (see https://github.com/k2-fsa/icefall/blob/master/egs/librispeech/ASR/prepare.sh). Finally, we use L_disambig.pt, G_3_gram.fst.txt and tokens.txt (mainly to get max_token_id) to generate HLG.pt.
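As a rough illustration of the first step of that pipeline (not icefall's actual code), deriving the token and word symbol tables from a lexicon could be sketched like this; the function name and the tiny lexicon are made up, and the L/G/HLG compilation itself needs k2 and is omitted:

```python
# Sketch: derive token and word id tables from lexicon.txt lines of the
# form "WORD t1 t2 ...". Both tables reserve id 0 for <eps>.
def lexicon_to_symbol_tables(lexicon_lines):
    words, tokens = {"<eps>": 0}, {"<eps>": 0}
    for line in lexicon_lines:
        fields = line.split()
        if not fields:
            continue
        word, pron = fields[0], fields[1:]
        words.setdefault(word, len(words))   # first occurrence gets next id
        for t in pron:
            tokens.setdefault(t, len(tokens))
    return tokens, words

tokens, words = lexicon_to_symbol_tables(["HELLO HH EH L OW", "WORLD W ER L D"])
```

The real prepare_lang.py additionally adds disambiguation symbols (#1, #2, ...) before building L_disambig.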
> If you see, token 3 (which is space) is labeled ' ' instead of '<space>' or in sentence piece '_'
You can modify the script so that if a line contains only one field, then this field must be an ID and the missing field is assumed to be " ", i.e., a space.
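The workaround suggested above could look roughly like this in plain Python (a sketch of a tolerant parser, not k2's actual `SymbolTable` code):

```python
# Sketch: parse a tokens.txt-style string, treating a single-field line
# as "the symbol was a space and got stripped". k2's own
# SymbolTable.from_str asserts exactly two fields per line.
def parse_symbol_table(text):
    id2sym = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) == 1:        # only the ID survived: symbol was " "
            id2sym[int(fields[0])] = " "
        elif len(fields) == 2:
            sym, idx = fields
            id2sym[int(idx)] = sym
        elif fields:
            raise ValueError(f"Expect 1 or 2 fields, given: {line!r}")
    return id2sym

table = parse_symbol_table("u 1\ne 2\n3\nn 4\n")
# table[3] == " "
```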
Hello @csukuangfj, @luomingshuang, thank you for the tips. On my side, it does not seem to work; I get two errors.
First, if I apply @csukuangfj's trick:
Removing disambiguation symbols on LG
Traceback (most recent call last):
File "compile_hlg.py", line 53, in <module>
main()
File "compile_hlg.py", line 46, in main
HLG = compile_HLG(outlang, outgraph, outgraph)
File "/home/aheba/ASR_LVCSR_E2E/K2/speechbrain/recipes/LibriSpeech/ASR/wfst/Graph-BASED/lvcsr/utils/make_TLG.py", line 71, in compile_HLG
assert isinstance(LG.aux_labels, k2.RaggedTensor)
AssertionError
Second, if I instead add the word <SPACE> and the token <space> to the lexicon:
the HLG compiles, but in the decoding phase I get an error when using HLG.fst:
[F] /usr/share/miniconda/envs/k2/conda-bld/k2_1635921650513/work/k2/csrc/context.h:249:k2::ContextPtr k2::GetContext(const First&, const Rest& ...) [with First = k2::RaggedShape; Rest = {k2::RaggedShape}; k2::ContextPtr = std::shared_ptr<k2::Context>] Check failed: ans1->IsCompatible(*ans2) Contexts are not compatible
[ Stack-Trace: ]
/home/aheba/anaconda3/envs/k2-decode-spbrain/lib/python3.8/site-packages/libk2_log.so(k2::internal::GetStackTrace()+0x47) [0x7f60555fc657]
/home/aheba/anaconda3/envs/k2-decode-spbrain/lib/python3.8/site-packages/libk2context.so(k2::internal::Logger::~Logger()+0x5a) [0x7f60558ebe0a]
/home/aheba/anaconda3/envs/k2-decode-spbrain/lib/python3.8/site-packages/libk2context.so(std::shared_ptr<k2::Context> k2::GetContext<k2::RaggedShape, k2::RaggedShape>(k2::RaggedShape const&, k2::RaggedShape const&)+0x1a7) [0x7f6055a72a37]
/home/aheba/anaconda3/envs/k2-decode-spbrain/lib/python3.8/site-packages/libk2context.so(k2::MultiGraphDenseIntersectPruned::MultiGraphDenseIntersectPruned(k2::Ragged<k2::Arc>&, k2::DenseFsaVec&, float, float, int, int)+0x469) [0x7f6055a83169]
/home/aheba/anaconda3/envs/k2-decode-spbrain/lib/python3.8/site-packages/libk2context.so(k2::IntersectDensePruned(k2::Ragged<k2::Arc>&, k2::DenseFsaVec&, float, float, int, int, k2::Ragged<k2::Arc>*, k2::Array1<int>*, k2::Array1<int>*)+0x98) [0x7f6055a6d688]
/home/aheba/anaconda3/envs/k2-decode-spbrain/lib/python3.8/site-packages/_k2.cpython-38-x86_64-linux-gnu.so(+0x6b34f) [0x7f605aef834f]
/home/aheba/anaconda3/envs/k2-decode-spbrain/lib/python3.8/site-packages/_k2.cpython-38-x86_64-linux-gnu.so(+0x2531a) [0x7f605aeb231a]
python3(+0x13c7ae) [0x555f078f17ae]
python3(_PyObject_MakeTpCall+0x3bf) [0x555f078e625f]
python3(_PyEval_EvalFrameDefault+0x540a) [0x555f0798fe5a]
python3(_PyEval_EvalCodeWithName+0x260) [0x555f079811f0]
python3(_PyFunction_Vectorcall+0x534) [0x555f07982754]
python3(PyObject_Call+0x7d) [0x555f078ec57d]
/home/aheba/anaconda3/envs/k2-decode-spbrain/lib/python3.8/site-packages/torch/lib/libtorch_python.so(THPFunction_apply(_object*, _object*)+0xb80) [0x7f60cf0bbce0]
python3(+0x13c83d) [0x555f078f183d]
python3(_PyObject_MakeTpCall+0x3bf) [0x555f078e625f]
python3(_PyEval_EvalFrameDefault+0x5437) [0x555f0798fe87]
python3(_PyEval_EvalCodeWithName+0x260) [0x555f079811f0]
python3(_PyFunction_Vectorcall+0x594) [0x555f079827b4]
python3(_PyEval_EvalFrameDefault+0x1517) [0x555f0798bf67]
python3(_PyEval_EvalCodeWithName+0x260) [0x555f079811f0]
python3(_PyFunction_Vectorcall+0x594) [0x555f079827b4]
python3(_PyEval_EvalFrameDefault+0x1517) [0x555f0798bf67]
python3(_PyEval_EvalCodeWithName+0xd5f) [0x555f07981cef]
python3(_PyFunction_Vectorcall+0x594) [0x555f079827b4]
python3(_PyEval_EvalFrameDefault+0x1517) [0x555f0798bf67]
python3(_PyFunction_Vectorcall+0x1b7) [0x555f079823d7]
python3(PyObject_Call+0x7d) [0x555f078ec57d]
python3(_PyEval_EvalFrameDefault+0x1dd3) [0x555f0798c823]
python3(_PyEval_EvalCodeWithName+0xd5f) [0x555f07981cef]
python3(_PyFunction_Vectorcall+0x594) [0x555f079827b4]
python3(_PyEval_EvalFrameDefault+0x71a) [0x555f0798b16a]
python3(_PyEval_EvalCodeWithName+0x260) [0x555f079811f0]
python3(PyEval_EvalCode+0x23) [0x555f07982aa3]
python3(+0x241382) [0x555f079f6382]
python3(+0x252202) [0x555f07a07202]
python3(+0x2553ab) [0x555f07a0a3ab]
python3(PyRun_SimpleFileExFlags+0x1bf) [0x555f07a0a58f]
python3(Py_RunMain+0x3a9) [0x555f07a0aa69]
python3(Py_BytesMain+0x39) [0x555f07a0ac69]
/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf3) [0x7f60eb0b40b3]
python3(+0x1f7427) [0x555f079ac427]
Traceback (most recent call last):
File "decode_hlg.py", line 440, in <module>
main()
File "/home/aheba/anaconda3/envs/k2-decode-spbrain/lib/python3.8/site-packages/torch/autograd/grad_mode.py", line 28, in decorate_context
return func(*args, **kwargs)
File "decode_hlg.py", line 415, in main
hyps_dict = decode_one_batch(
File "decode_hlg.py", line 204, in decode_one_batch
lattice = get_lattice(
File "/home/aheba/ASR_LVCSR_E2E/K2/speechbrain/recipes/LibriSpeech/ASR/wfst/Graph-BASED/lvcsr/utils/decode.py", line 118, in get_lattice
lattice = k2.intersect_dense_pruned(
File "/home/aheba/anaconda3/envs/k2-decode-spbrain/lib/python3.8/site-packages/k2/autograd.py", line 717, in intersect_dense_pruned
_IntersectDensePrunedFunction.apply(a_fsas, b_fsas, out_fsa, search_beam,
File "/home/aheba/anaconda3/envs/k2-decode-spbrain/lib/python3.8/site-packages/k2/autograd.py", line 414, in forward
ragged_arc, arc_map_a, arc_map_b = _k2.intersect_dense_pruned(
RuntimeError:
Some bad things happened. Please read the above error messages and stack
trace. If you are using Python, the following command may be helpful:
gdb --args python /path/to/your/code.py
(You can use `gdb` to debug the code. Please consider compiling
a debug version of k2.).
If you are unable to fix it, please open an issue at:
It seems that the intersection between H, L and G goes wrong somewhere...
> So the HLG compiles, but in the decoding phase I have an error when using HLG.fst
> It seems that the intersection between H, L and G goes wrong somewhere...
[F] /usr/share/miniconda/envs/k2/conda-bld/k2_1635921650513/work/k2/csrc/context.h:249:k2::ContextPtr k2::GetContext(const First&, const Rest& ...) [with First = k2::RaggedShape; Rest = {k2::RaggedShape}; k2::ContextPtr = std::shared_ptr<k2::Context>] Check failed: ans1->IsCompatible(*ans2) Contexts are not compatible
Please make sure all your inputs to `intersect` are on the same device.
`k2::Fsa` has an attribute `device`, like torch tensors, telling you on which device this FSA is.
> Please make sure all your inputs to `intersect` are on the same device. `k2::Fsa` has an attribute `device`, like torch tensors, telling you on which device this FSA is.
Yeah, it was a device problem: when using `CUDA_VISIBLE_DEVICES=1 python decode_hlg.py`, my nnet output was on cuda:0 (due to my HF config).
Let me start pushing the final updates for handling the CategoricalEncoder.
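In code, the fix boils down to moving the nnet output so that both sides of the intersection share a device. A minimal torch-only sketch (the helper name is made up; at the real call site the graph would be a k2.Fsa, whose `.device` attribute plays the role of `graph_device` here):

```python
import torch

def align_devices(nnet_output, graph_device):
    """Move the network output to the device the decoding graph is on."""
    if nnet_output.device != graph_device:
        nnet_output = nnet_output.to(graph_device)
    return nnet_output

# Stand-in for the nnet output: (batch, frames, classes) log-probs.
log_probs = torch.randn(1, 10, 5)
log_probs = align_devices(log_probs, torch.device("cpu"))
```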
@csukuangfj, @luomingshuang, is there a comparison between your ctc_topo and the one proposed by Miao in EESEN?
> is there a comparison between your ctc_topo and the one proposed by Miao in EESEN?
No, not yet.
The one proposed by Miao in EESEN is very similar to k2.ctc_topo(modified=True), I think.
We have provided APIs that are much easier to use in https://github.com/k2-fsa/k2/pull/1096
Is anyone interested to pick up this PR?
Yes! @Gastron do you think you can take a look? We definitely need to support decoding with FST as mentioned privately. @MartinKocour could also be interested.
These APIs in k2 look very nice for HMM- and CTC-based models! Unfortunately I don't have time to experiment with them at the moment, @mravanelli. Is there something similar for autoregressive models (RNN-T, maybe even attention-based encoder-decoder)? @csukuangfj
We also have decoding methods for transducers (not RNN transducers; there are no recurrent connections in our transducer models). If you are interested, please have a look at https://github.com/k2-fsa/icefall
For the C++ APIs, please look at https://github.com/k2-fsa/sherpa
Hello,
I am closing this PR as we are currently working on a new integration of k2 in SpeechBrain (see #2065). If you'd like to contribute, feel free to join the discussion there. Thanks for your work.
Best, Adel