
Should UW be included in the phoneme set?

Open · jaeseongyou opened this issue 5 years ago · 1 comment

Should UW be included in the phoneme set? It seems g2p.phonemes follows a general rule of excluding a 'parent' category when its variants exist. For example, AA is not included because its variants AA0, AA1, and AA2 are in the set. The same goes for AE, AH, AW, AY, etc. UW seems to be the only exception. Furthermore, in simple (not especially rigorous) frequency analyses on sizable corpora, UW never occurs while its variants do. I wonder whether the phoneme set can safely drop UW.
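For anyone who wants to reproduce the check, here is a minimal sketch of both steps: finding symbols that appear bare and also with a stress digit, and counting phoneme frequencies. The `sample` list and the two toy sequences are illustrative assumptions, not a dump of the real phoneme set or of any corpus, and the helper names `redundant_parents` and `count_phonemes` are made up for this example. In practice you would pass in the list this issue refers to as g2p.phonemes and your own phoneme sequences.

```python
import re
from collections import Counter


def redundant_parents(phonemes):
    """Return symbols present both bare (e.g. 'UW') and with a stress digit (e.g. 'UW1')."""
    symbols = set(phonemes)
    # Strip the trailing digit from every stress-marked symbol to get its base.
    stressed_bases = {p[:-1] for p in symbols if re.fullmatch(r"[A-Z]+[0-2]", p)}
    return sorted(symbols & stressed_bases)


def count_phonemes(sequences):
    """Count how often each phoneme symbol occurs across a list of phoneme sequences."""
    return Counter(p for seq in sequences for p in seq)


# Illustrative subset only (assumption), shaped like the set described above.
sample = ["AA0", "AA1", "AA2", "B", "UH0", "UH1", "UH2",
          "UW", "UW0", "UW1", "UW2", "Z"]
print(redundant_parents(sample))

# Toy sequences (assumption) standing in for corpus output.
counts = count_phonemes([["Y", "UW1"], ["T", "UW1"]])
print(counts["UW"], counts["UW1"])
```

If the first call returns only UW on the full list and the bare symbol's count stays at zero on a large corpus, that would support the observation above, though it would not by itself prove that removing UW is safe for existing checkpoints that index into the list.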

jaeseongyou · Feb 13 '20 01:02

I have the same question.

iclementine · Apr 07 '21 10:04