Sung-Lin Yeh
I will add a method to validate whether users define scorers correctly. E.g., `coverage_scorer` should always be put into `full_scorers`. Or `ctc_weight` should be 1.0 if `ctc_scorer` is put into...
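A rough sketch of the kind of validation I have in mind (the helper name, the set of full-only scorers, and the error messages are illustrative, not the merged SpeechBrain API):

```python
def validate_scorers(full_scorers, partial_scorers, weights):
    """Toy check that scorers are placed in the right list.

    full_scorers / partial_scorers: lists of scorer names.
    weights: dict mapping scorer name -> float weight (mutated in place).
    """
    # Some scorers need the full vocabulary distribution, so they can
    # never be partial scorers; coverage_scorer is one example.
    full_only = {"coverage_scorer"}
    misplaced = full_only & set(partial_scorers)
    if misplaced:
        raise ValueError(f"{sorted(misplaced)} must be put into full_scorers.")
    # Every listed scorer gets an explicit weight, defaulting to 0.0.
    for name in list(full_scorers) + list(partial_scorers):
        weights.setdefault(name, 0.0)
```

A check like this can run once in the beamsearcher's `__init__`, so misconfigured yaml fails fast instead of silently producing bad hypotheses.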
Hi @mravanelli , the partial scorers score the top-k tokens based on the log-probs produced after the full scorers. Scoring all tokens in the vocabulary is too expensive for some scorers, e.g. ngram...
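A toy sketch of the partial-scoring idea: rescore only the k most promising candidates from the full-scorer log-probs instead of the whole vocabulary (the function name and scorer interface here are illustrative, not the SpeechBrain API):

```python
import torch

def partial_rescore(log_probs, partial_scorer, k=5, weight=0.5):
    """Add a partial scorer's contribution to the top-k tokens only.

    log_probs: (batch, vocab) log-probs after full scorers are applied.
    partial_scorer: callable taking (batch, k) token ids, returning
        (batch, k) scores; e.g. an expensive n-gram LM lookup.
    """
    topk_vals, topk_idx = log_probs.topk(k, dim=-1)
    # The expensive scorer only evaluates k candidates per hypothesis.
    partial_scores = partial_scorer(topk_idx)  # (batch, k)
    # Scatter the weighted partial scores back into the vocab dimension.
    return log_probs.scatter_add(-1, topk_idx, weight * partial_scores)
```

With k much smaller than the vocabulary size, the cost of the partial scorer drops from O(vocab) to O(k) per step.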
I prefer using a list of tuples.
I use a default weight of 0.0 if scorers' weights are not specified. I can go for solution 2, with an error message that checks whether the weights and the full/partial scorer lists are...
@mravanelli, @Gastron I have adapted those changes to the yaml files that involve the beamsearch part. For the top-k hypothesis output, I suggest we modify train.py to obtain the best hyps from...
We can make use of `undo_padding`:

```python
from speechbrain.utils.data_utils import undo_padding

# Select the best hypothesis
best_hyps, best_lens = topk_tokens[:, 0, :], topk_lens[:, 0]

# Convert best hypothesis to list...
```
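For context, a toy reimplementation of what `undo_padding` does (my own sketch of its behavior, not the library code): it trims each padded sequence back to its true length using relative lengths in [0, 1].

```python
def undo_padding_sketch(batch, lengths):
    """Trim padded sequences using relative lengths.

    batch: iterable of padded sequences (all the same padded length).
    lengths: relative lengths, i.e. true_len / padded_len per sequence.
    Returns a list of plain Python lists without padding.
    """
    trimmed = []
    for seq, rel_len in zip(batch, lengths):
        abs_len = int(round(rel_len * len(seq)))
        trimmed.append(list(seq[:abs_len]))
    return trimmed
```

So passing the best hypotheses and their relative lengths yields clean token lists ready for detokenization.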
Related to #1550: I have actually addressed the issue at https://github.com/speechbrain/speechbrain/blob/102f9f785f13f96110de02cae7a00df217d4e247/speechbrain/decoders/seq2seq.py#L733-L741 It has not been merged yet, as it involves several major changes. I also remember a similar issue was raised before,...
Hi, this is a typo. We fixed it in a refactored version but have not merged it yet (ping @mravanelli). https://github.com/speechbrain/speechbrain/blob/a7c4e44c3176a699cbaac3cd90afc66817b9f7d3/speechbrain/decoders/scorer.py#L569-L574 Regarding `len(cur_attn.size()) > 2`, are you running the transformer...
Please see my reply here #1514.
Hi @csukuangfj, I also spotted this issue. The default args `epoch` and `update_until_epoch` should be set properly for normalization.