Danylo Mysak
The glottal stop (`ʔ`) as well as both voiced (`ɦ`) and voiceless (`h`) glottal fricatives are classified as sonorant. Should they really be?
I’ve used [the chart](https://en.wikipedia.org/wiki/International_Phonetic_Alphabet#Pulmonic_consonants) from Wikipedia to check whether all consonants in there are known by PanPhon. It turns out many are not: `['ɳ̊', 'ɲ̊', 'ŋ̊', 'p̪', 'b̪', 'ʡ', 'θ̼',...
The issue can probably best be demonstrated with an example:

```python
from ua_gec import AnnotatedText

text = AnnotatedText(r'\n{=>_}')
print(len(text.get_original_text()))  # output: 1
print(text.get_annotations()[0].start)  # output: 2
```
The output of `iconv -f UTF-8 cmudict-0.7b > /dev/null; echo $?` (as suggested [here](https://stackoverflow.com/questions/115210/how-to-check-whether-a-file-is-valid-utf-8/115262#115262)) is: `iconv: cmudict-0.7b:35733:1: cannot convert` Removing the offending line (line 35733, according to the message) fixes things.
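For anyone who wants to locate such lines without `iconv`, here is a minimal sketch using only the Python standard library. The function name `find_invalid_utf8_lines` is made up for illustration and is not part of any project mentioned here. It reads the file as raw bytes and tries to decode each line separately, so it reports every offending line rather than stopping at the first one.

```python
def find_invalid_utf8_lines(path):
    """Return (line_number, byte_offset, raw_line) for each line that is not valid UTF-8.

    line_number is 1-based; byte_offset is the 0-based position within the
    line of the first byte that could not be decoded.
    """
    bad_lines = []
    # Binary mode: iterating the file yields raw lines split on b"\n".
    with open(path, "rb") as handle:
        for line_number, raw_line in enumerate(handle, start=1):
            try:
                raw_line.decode("utf-8")
            except UnicodeDecodeError as error:
                bad_lines.append((line_number, error.start, raw_line))
    return bad_lines
```

Calling `find_invalid_utf8_lines("cmudict-0.7b")` should then list the problematic lines, assuming the file is in the current directory. Note that the byte offset here is 0-based, whereas the column in the `iconv` message appears to be 1-based.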