model-zoo
speech-blstm: parsed TIMIT labels do not look right
The vast majority of labels parsed from the TIMIT corpus (all except 49, i.e. ~98.5%) have all 8 samples with the true bit in the first position, i.e. encoding the phoneme "h#". Surely this cannot be right. I suspect the offending code is in makeFeatures in 00-data.jl. I am not very familiar with how mfcc works or how its frames should be aligned with the labels from the phn files, so I would need some help fixing this.
TIMIT was downloaded from Academic Torrents, and the wav files were then processed with sox.
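
For reference, here is a minimal sketch of one common way to align MFCC frames with phn labels. It is written in Python purely to illustrate the logic and is not the code in 00-data.jl. It assumes the phn files list segments as `start_sample end_sample phoneme` in sample indices, that the MFCC uses a 400-sample window and a 160-sample step (25 ms and 10 ms at TIMIT's 16 kHz), and that each frame takes the label of the segment containing its centre sample. The function names and the three-segment example are made up for illustration. If frame positions were compared against the sample boundaries in the wrong unit (for example frame indices or seconds), nearly every frame would land in the first segment, which would be consistent with seeing "h#" almost everywhere, though I have not confirmed that this is the cause.

```python
from bisect import bisect_right


def parse_phn(text):
    """Parse .phn content into (start_sample, end_sample, phoneme) tuples."""
    segments = []
    for line in text.strip().splitlines():
        start, end, phone = line.split()
        segments.append((int(start), int(end), phone))
    return segments


def frame_labels(segments, n_frames, win=400, step=160):
    """Label each frame with the phoneme whose segment contains the frame centre.

    win and step are in samples; the defaults are assumed values, not taken
    from the repository.
    """
    starts = [start for start, _, _ in segments]
    labels = []
    for i in range(n_frames):
        centre = i * step + win // 2            # frame centre, in samples
        idx = max(bisect_right(starts, centre) - 1, 0)
        labels.append(segments[idx][2])
    return labels


# Hypothetical three-segment utterance, boundaries in samples.
PHN = """0 3050 h#
3050 4559 sh
4559 5723 ix
"""

segments = parse_phn(PHN)
total_samples = segments[-1][1]
n_frames = 1 + (total_samples - 400) // 160     # number of full windows
labels = frame_labels(segments, n_frames)
```

A quick check on the output is to count labels per phoneme: with correct alignment the counts should be roughly proportional to segment durations rather than dominated by "h#".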