pocolm
swbd and/or swbd_fisher example script
We need some example scripts on real data to help compare perplexities against baselines like SRILM and kaldi_lm (part of this is to work on those baselines). A Switchboard-only setup, with the first 10k utterances (or is it the last?) used as dev data, would be nice; it could then be compared with the LM-estimation scripts in Kaldi's Switchboard setup. Also, since the main point of this toolkit is better interpolation, we need an example setup in which multiple datasets are combined, e.g. the Switchboard+Fisher setup that we currently use (optionally) in the Switchboard example scripts in Kaldi.
@vijayaditya, you could help with this if you have time; you said you were interested in LM stuff.
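For illustration, the dev/train split described above could be sketched roughly as follows. This is only a minimal sketch, not part of pocolm or Kaldi: the function name `split_dev_train` and the `dev_from` switch are hypothetical, and the switch exists only because the issue leaves open whether the first or the last 10k utterances form the dev set.

```python
from typing import List, Sequence, Tuple


def split_dev_train(
    utts: Sequence[str], num_dev: int, dev_from: str = "first"
) -> Tuple[List[str], List[str]]:
    """Hold out num_dev utterances as dev data; return (dev, train).

    Hypothetical helper: dev_from selects whether the held-out
    utterances come from the start or the end of the list.
    """
    if num_dev < 0 or num_dev > len(utts):
        raise ValueError("num_dev must be between 0 and len(utts)")
    if dev_from == "first":
        return list(utts[:num_dev]), list(utts[num_dev:])
    if dev_from == "last":
        split = len(utts) - num_dev
        return list(utts[split:]), list(utts[:split])
    raise ValueError("dev_from must be 'first' or 'last'")
```

With one utterance per list element (for example, the lines of a text file with any utterance IDs already stripped), `split_dev_train(lines, 10000)` would hold out the first 10k utterances, and `dev_from="last"` the final 10k.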
Ok on it.
Vijay (replying by email to Daniel Povey's message of May 9, 2016, 07:07)