pocolm

swbd and/or swbd_fisher example script

Open: danpovey opened this issue 9 years ago • 1 comment

We need some example scripts on real data so that we can compare perplexities against baselines like SRILM and kaldi_lm (part of this is to work on those baselines). A Switchboard-only setup, with the first 10k utts (or is it the last?) used as dev data, would be nice; compare with the LM-estimating scripts in Kaldi's Switchboard setup. Also, since the main point of this toolkit is better interpolation, we need an example setup with multiple datasets to be combined, e.g. the Switchboard+Fisher setup that we currently use (optionally) in the Switchboard example scripts in Kaldi.
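As a rough illustration of the held-out split and the perplexity comparison described above, here is a minimal sketch. It does not use pocolm, SRILM, or kaldi_lm: the function names `split_dev_train` and `perplexity` are hypothetical, the 10k dev size comes from the text, and whether dev is taken from the start or the end is left as a parameter because that is still an open question here. It also assumes each toolkit reports a total log10 probability over the same dev tokens.

```python
def split_dev_train(utterances, num_dev=10000, dev_from_start=True):
    """Hold out num_dev utterances as dev data; the rest is training data.

    dev_from_start selects the first num_dev utterances as dev; otherwise
    the last num_dev are used (hypothetical helper, not a pocolm API).
    """
    if num_dev < 0 or num_dev > len(utterances):
        raise ValueError("num_dev must be between 0 and len(utterances)")
    if dev_from_start:
        return utterances[:num_dev], utterances[num_dev:]
    split = len(utterances) - num_dev
    return utterances[split:], utterances[:split]


def perplexity(total_log10_prob, num_tokens):
    """Perplexity from a total log10 probability over num_tokens tokens.

    Assumes every toolkit being compared scores the same token set
    (e.g. the same handling of end-of-sentence and OOV words).
    """
    if num_tokens <= 0:
        raise ValueError("num_tokens must be positive")
    return 10.0 ** (-total_log10_prob / num_tokens)


if __name__ == "__main__":
    # Toy data standing in for Switchboard transcripts.
    utts = ["utt %d" % i for i in range(25)]
    dev, train = split_dev_train(utts, num_dev=10)
    print(len(dev), len(train))
```

For a fair comparison, the same dev split and the same token-counting convention would need to be used for every baseline.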

@vijayaditya, you could help with this if you have time; you said you were interested in LM stuff.

danpovey, May 09 '16 00:05

Ok on it.

Vijay


vijayaditya, May 09 '16 01:05