Documenting the notation in use.
https://en.wikipedia.org/wiki/ARPABET
cmudict was developed primarily for use in speech recognition. At some point it had ~50 symbols (e.g. aspirated stops like TH, DH; flaps, DX; AX/AH; and other variants). It was believed that maintaining phonetic distinctions was important. Turns out it wasn't (acoustic modeling got better).
@lenzo-ka @Coeur I agree. At minimum, the comment should say something like, "CMUdict transcriptions use a modified version of ARPABET encodings."
Updated the PR, taking into account the review.
Note: please just apply your desired improvements directly, no need to wait years for the original author, ha ha.