Documenting the notation in use.

Open Coeur opened this issue 7 years ago • 3 comments

https://en.wikipedia.org/wiki/ARPABET

Coeur avatar Jan 17 '19 07:01 Coeur

cmudict was developed primarily for use in speech recognition. At some point it had ~50 symbols (e.g. aspirated stops like TH, DH; flaps, DX; AX/AH; and other variants). It was believed that maintaining phonetic distinctions was important. It turns out it wasn't (acoustic modeling got better).

Alexir avatar Jun 01 '23 12:06 Alexir

@lenzo-ka @Coeur I agree. At minimum, the comment should say something like, "CMUdict transcriptions use a modified version of ARPABET encodings."

danmartinez avatar Jul 17 '23 23:07 danmartinez

Updated the PR, taking into account the review.

Note: please just apply your desired improvements; no need to wait years for the original author, ha ha.

Coeur avatar Jul 22 '23 16:07 Coeur