icefall
Phone-based lang vs BPE
Hi guys,
I need to use a phone-based lang because I have lots of words for which I need to provide exact pronunciations. What would be the best way to handle this use case? My understanding is that with BPE models, the model itself generates the tokens (phones) for each word, so I can't provide a phone sequence for new words with different pronunciations.
Thanks
For example, in the egs where HLG is used in decoding, it should be possible for you to impose a pronunciation on words yourself, i.e. manually create the L, but the pronunciation still has to be made up of tokens from the BPE model you have trained.
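To make that concrete, here is a rough sketch of hand-writing lexicon entries out of BPE tokens. Everything in it is an assumption for illustration: the token set, the word, the token sequences and the helper name `build_lexicon_lines` are all made up, and it assumes the lexicon uses a plain `WORD token token ...` line layout. In practice you would take the token set from your own trained BPE model. The only real point is the check that every hand-written pronunciation uses tokens the model actually knows.

```python
from typing import Dict, List, Set


def build_lexicon_lines(
    pronunciations: Dict[str, List[List[str]]],
    bpe_tokens: Set[str],
) -> List[str]:
    """Return lexicon lines of the form 'WORD tok1 tok2 ...'.

    Raises ValueError if a pronunciation uses a token that is not in
    the BPE vocabulary, since L can only map words to known tokens.
    """
    lines = []
    for word, variants in sorted(pronunciations.items()):
        for tokens in variants:
            unknown = [t for t in tokens if t not in bpe_tokens]
            if unknown:
                raise ValueError(
                    f"{word}: tokens not in BPE vocabulary: {unknown}"
                )
            lines.append(f"{word} {' '.join(tokens)}")
    return lines


# Hypothetical BPE vocabulary (in practice, read it from your trained model).
bpe_tokens = {"▁", "▁ca", "▁k", "sh", "ash", "e", "a", "ch"}

# Hypothetical hand-written pronunciations: one word, two variants.
pronunciations = {
    "CACHE": [["▁ca", "sh"], ["▁k", "ash"]],
}

lexicon_lines = build_lexicon_lines(pronunciations, bpe_tokens)
print("\n".join(lexicon_lines))
```

The resulting lines would then go into the lexicon that L is compiled from. Note this only lets you choose among token sequences the BPE model can emit; it does not give you true phone-level control.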