Hán/Nôm output
The authors of ChuNom.org have created a Chrome extension called ChromiNom that takes Telex input and transforms it into Hán and Nôm characters on the fly. It downloads these two JSON dictionaries and loads them into client-side storage:
http://www.chunom.org/entry/base_chars/
http://www.chunom.org/entry/generated_chars/
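The download-then-store pattern could look roughly like the sketch below. It is written in Python for brevity; in a browser extension the same idea would use the browser's own storage instead of files. The function names, the `cache_dir` parameter, and the short dictionary names are hypothetical, and nothing is assumed about the JSON structure beyond it being valid JSON.

```python
import json
from pathlib import Path
from urllib.request import urlopen

# The two dictionary URLs quoted above; the short names are made up for this sketch.
DICTIONARY_URLS = {
    "base": "http://www.chunom.org/entry/base_chars/",
    "generated": "http://www.chunom.org/entry/generated_chars/",
}


def fetch_text(url):
    """Download a URL and return its body decoded as UTF-8."""
    with urlopen(url, timeout=30) as response:
        return response.read().decode("utf-8")


def load_dictionary(name, cache_dir, fetch=fetch_text):
    """Return the parsed dictionary, downloading it only on first use."""
    cache_file = Path(cache_dir) / f"{name}.json"
    if cache_file.exists():
        # Already stored locally: no network access needed.
        return json.loads(cache_file.read_text(encoding="utf-8"))
    text = fetch(DICTIONARY_URLS[name])
    data = json.loads(text)  # parse first so invalid JSON is never cached
    cache_file.parent.mkdir(parents=True, exist_ok=True)
    cache_file.write_text(text, encoding="utf-8")
    return data
```

Passing `fetch` as a parameter keeps the network call replaceable, which makes the caching behaviour easy to check without contacting the server.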
Since the dictionaries are downloaded on demand, Hán–Nôm support wouldn’t add much to AVIM’s download size. However, ChuNom.org advocates a neo-standardization of chữ Nôm, so the dictionaries contain only one character per Vietnamese word, in most cases the most common one.
For broader coverage, we could combine this dictionary with the one by Lê Sơn Thanh inside WinVNKey. Lê Sơn Thanh’s dictionary is quite comprehensive but might not be ranked by frequency. Many of its characters are encoded in the Private Use Area (PUA), because they hadn’t been incorporated into Unicode/Unihan at the time. We should leave out PUA characters until a mapping between them and their Unicode equivalents is available.
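A minimal sketch of such a merge, written in Python as an offline preprocessing step. It assumes each dictionary has already been converted to a mapping from a Vietnamese reading to a list of candidate characters; that shape, the function names, and the sample entries are illustrative assumptions, not the actual format of either dictionary. The preferred (ChuNom.org) character stays first, and supplementary candidates are appended unless they are duplicates or fall in the PUA.

```python
import unicodedata


def is_private_use(text):
    """True if any code point in text has general category Co (private use)."""
    return any(unicodedata.category(ch) == "Co" for ch in text)


def merge_candidates(preferred, supplementary):
    """Merge two reading -> [characters] mappings.

    Entries from `preferred` keep their position at the front of each list;
    entries from `supplementary` follow, skipping duplicates and PUA characters.
    """
    merged = {reading: list(chars) for reading, chars in preferred.items()}
    for reading, chars in supplementary.items():
        bucket = merged.setdefault(reading, [])
        for ch in chars:
            if is_private_use(ch) or ch in bucket:
                continue
            bucket.append(ch)
    return merged


# Illustrative sample data only, not taken from either dictionary.
preferred = {"chữ": ["𡨸"]}
supplementary = {"chữ": ["𡨸", "字", "\ue123"], "nôm": ["喃"]}

merged = merge_candidates(preferred, supplementary)
print(merged["chữ"])  # the PUA entry and the duplicate are dropped
```

Checking the general category rather than hard-coding ranges covers the BMP Private Use Area as well as the two supplementary private use planes.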