timit_tools
Tools for preparing TIMIT for HMM (with HTK) and deep learning (with Theano) methods.
Preparing the dataset
With the TIMIT dataset (.wav sound files, .wrd word annotations and .phn phone annotations):
- Encode the wave sound in MFCCs: run
python mfcc_and_gammatones.py --htk-mfcc $DATASET/train
and
python mfcc_and_gammatones.py --htk-mfcc $DATASET/test
producing the .mfc files with HCopy according to wav_config (the .mfc_unnorm files are the version without normalization).
- Convert the annotations given in .phn (boundaries in samples) into HTK .lab files (times in units of 100 ns): run
python timit_to_htk_labels.py $DATASET/train
and
python timit_to_htk_labels.py $DATASET/test
producing the .lab files.
- Replace phones according to the seminal HMM paper of 1989, "Speaker-independent phone recognition using hidden Markov models". The number of phones (i.e. the number of lines in the future labels dictionary) should go from 61 to 39: run
python substitute_phones.py $DATASET/train
and
python substitute_phones.py $DATASET/test
- Create the phones MLF and the labels: run
python create_phonesMLF_and_labels.py $DATASET/train
and
python create_phonesMLF_and_labels.py $DATASET/test
You can also run all of the steps above at once with make prepare dataset=DATASET_PATH.
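For intuition, the label conversion step above is essentially a change of time units. The sketch below is not the code of timit_to_htk_labels.py; it assumes that .phn boundaries are sample indices at 16 kHz and that HTK label times are integers in units of 100 ns. The function names are made up for illustration.

```python
SAMPLE_RATE_HZ = 16000           # assumption: TIMIT audio sampled at 16 kHz
HTK_UNITS_PER_SECOND = 10000000  # HTK label times are counted in 100 ns units


def samples_to_htk_time(sample_index, sample_rate=SAMPLE_RATE_HZ):
    """Convert a sample index into an integer HTK time stamp (100 ns units)."""
    return sample_index * HTK_UNITS_PER_SECOND // sample_rate


def phn_to_lab_lines(phn_lines):
    """Turn '<start> <end> <phone>' lines (in samples) into HTK label lines."""
    lab_lines = []
    for line in phn_lines:
        parts = line.split()
        if len(parts) != 3:
            continue  # skip blank or malformed lines
        start, end, phone = int(parts[0]), int(parts[1]), parts[2]
        lab_lines.append("%d %d %s" % (samples_to_htk_time(start),
                                       samples_to_htk_time(end),
                                       phone))
    return lab_lines
```

At 16 kHz one sample lasts 62.5 microseconds, i.e. 625 HTK units, so a boundary at sample 3050 becomes 1906250.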
You're ready for training with HTK (mfc and lab files)!
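To illustrate the phone replacement step, here is a minimal sketch of folding phone labels with a lookup table. The table is only a partial excerpt of the commonly cited 61-to-39 folding, given as an assumption: the table actually used by substitute_phones.py may differ, and the function names are hypothetical.

```python
# Partial, illustrative folding table (an assumption, not the script's own table).
FOLDINGS = {
    "ao": "aa",
    "ax": "ah",
    "axr": "er",
    "hv": "hh",
    "ix": "ih",
    "el": "l",
    "em": "m",
    "en": "n",
    "ux": "uw",
    "zh": "sh",
}


def fold_phone(phone):
    """Return the folded phone, or the phone itself if it is not in the table."""
    return FOLDINGS.get(phone, phone)


def fold_lab_line(line):
    """Replace the phone in a '<start> <end> <phone>' label line."""
    start, end, phone = line.split()
    return " ".join([start, end, fold_phone(phone)])
```

Phones that are absent from the table pass through unchanged, so the same function can be applied to every line of a .lab file.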
Training the HMM models
Train monophone HMMs:
make train_monophones dataset_train_folder=PATH_TO_YOUR_DATASET/train
make test_monophones dataset_test_folder=PATH_TO_YOUR_DATASET/test
Or, train triphones:
TODO
make train_triphones dataset_train_folder=PATH_TO_YOUR_DATASET/train
make test_triphones dataset_test_folder=PATH_TO_YOUR_DATASET/test
Replacing the GMM by DBNs
- Do a full-state forced alignment of the .mlf files with make align.
- Do a first preparation of the dataset with src/timit_to_numpy.py or src/mocha_timit_to_numpy.py (depending on the dataset) on the aligned .mlf files from the previous step.
- Train the deep belief networks on it, using DBN/DBN_timit.py, DBN/DBN_Gaussian_timit.py or DBN/DBN_Gaussian_mocha_timit.py (see inside these files for parameters). Save (pickle at the moment) the DBN objects and the states/indices mappings.
- Use the serialized DBN objects and states/indices mappings with viterbi.py: just cd to DBN and do:
python ../src/viterbi.py output_dbn.mlf /fhgfs/bootphon/scratch/gsynnaeve/TIMIT/test/test.scp ../tmp_train/hmm_final/hmmdefs --d ../dbn_5.pickle ../to_int_and_to_state_dicts_tuple.pickle
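As a rough idea of what the states/indices mappings could look like, the sketch below builds two dictionaries (state name to integer, and integer back to state name) and pickles them as a tuple. This is an assumption guided by the file name to_int_and_to_state_dicts_tuple.pickle, not the actual code of the DBN scripts. The state naming ("aa[2]") and all function names are hypothetical.

```python
import pickle


def build_state_mappings(state_names):
    """Build (to_int, to_state) dictionaries from an iterable of state names."""
    to_int = {}
    to_state = {}
    for name in sorted(set(state_names)):
        index = len(to_int)  # next free integer index
        to_int[name] = index
        to_state[index] = name
    return to_int, to_state


def save_mappings(path, to_int, to_state):
    """Pickle both dictionaries together as a single tuple."""
    with open(path, "wb") as f:
        pickle.dump((to_int, to_state), f, pickle.HIGHEST_PROTOCOL)


def load_mappings(path):
    """Load the (to_int, to_state) tuple back from disk."""
    with open(path, "rb") as f:
        return pickle.load(f)
```

Keeping both directions in one file means the decoder can turn the network's output indices back into HMM state names without rebuilding the mapping.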