
Hán/Nôm output

Open 1ec5 opened this issue 10 years ago • 0 comments

The authors of ChuNom.org have created a Chrome extension called ChromiNom that takes Telex input and transforms it into Hán and Nôm characters on the fly. It downloads these two JSON dictionaries and loads them into client-side storage:

http://www.chunom.org/entry/base_chars/
http://www.chunom.org/entry/generated_chars/

Since the dictionaries are downloaded on demand, Hán–Nôm support wouldn’t add much to AVIM’s download size. However, ChuNom.org advocates a neo-standardization of chữ Nôm, so the dictionaries contain only one character per Vietnamese word. In most cases, they’ve selected the most common character.
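To illustrate what a one-character-per-word dictionary implies for conversion, here is a rough TypeScript sketch. The `SingleCharDictionary` type, the `convertText` function, and the two sample entries are assumptions made for illustration; the actual JSON format served by ChuNom.org is not described here, and the entries are just well-known Sino-Vietnamese readings rather than data taken from those dictionaries.

```typescript
// Hypothetical shape: one character per lowercase Vietnamese word (syllable).
type SingleCharDictionary = Map<string, string>;

// Illustrative sample entries only; not taken from the ChuNom.org dictionaries.
const sampleDictionary: SingleCharDictionary = new Map([
  ["việt", "越"],
  ["nam", "南"],
]);

// Replace every word that has a dictionary entry with its single character.
// Words without an entry, and all whitespace, are left untouched.
function convertText(text: string, dict: SingleCharDictionary): string {
  return text.replace(/\S+/g, (word) => {
    // Normalize so precomposed and decomposed diacritics look up the same key.
    const key = word.normalize("NFC").toLowerCase();
    return dict.get(key) ?? word;
  });
}
```

With a dictionary like this, each word has exactly one possible output, so no candidate list or ranking is needed. The trade-off is that a word with several attested characters can never produce the alternatives.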

For broader coverage, we could combine these dictionaries with the one by Lê Sơn Thanh inside WinVNKey. Lê Sơn Thanh’s dictionary is quite comprehensive but might not be ranked by frequency. Many of its characters are encoded in the Private Use Area (PUA), because they hadn’t been incorporated into Unicode/Unihan at the time. We should leave out PUA characters until a mapping between them and their Unicode equivalents is available.
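A minimal sketch of how the merge and the PUA filter could work, in TypeScript. All names here (`isPrivateUse`, `containsPrivateUse`, `mergeDictionaries`, `CandidateDictionary`) and the dictionary shapes are hypothetical, since neither dictionary's real format is described above. Only the three Private Use Area ranges come from the Unicode Standard. The ChuNom.org character is kept first for each word because it is usually the most common one, and the other dictionary's candidates follow in whatever order it lists them.

```typescript
// Hypothetical shape: each Vietnamese word maps to an ordered candidate list.
type CandidateDictionary = Map<string, string[]>;

// True if the code point lies in one of Unicode's three Private Use Areas.
function isPrivateUse(codePoint: number): boolean {
  return (
    (codePoint >= 0xe000 && codePoint <= 0xf8ff) ||
    (codePoint >= 0xf0000 && codePoint <= 0xffffd) ||
    (codePoint >= 0x100000 && codePoint <= 0x10fffd)
  );
}

// Walk the string by code point so supplementary-plane characters are handled.
function containsPrivateUse(text: string): boolean {
  for (let i = 0; i < text.length; ) {
    const codePoint = text.codePointAt(i)!;
    if (isPrivateUse(codePoint)) return true;
    i += codePoint > 0xffff ? 2 : 1; // skip both halves of a surrogate pair
  }
  return false;
}

// Preferred characters come first; supplemental candidates are appended,
// skipping duplicates and anything encoded in the PUA.
function mergeDictionaries(
  preferred: Map<string, string>,
  supplemental: CandidateDictionary,
): CandidateDictionary {
  const merged: CandidateDictionary = new Map();
  preferred.forEach((character, word) => merged.set(word, [character]));
  supplemental.forEach((candidates, word) => {
    const list = merged.get(word) ?? [];
    for (const candidate of candidates) {
      if (!containsPrivateUse(candidate) && list.indexOf(candidate) === -1) {
        list.push(candidate);
      }
    }
    if (list.length > 0) merged.set(word, list);
  });
  return merged;
}
```

A word whose only candidates are PUA characters gets no entry at all under this approach, so it would simply stay as Latin text until a PUA-to-Unicode mapping is available.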

1ec5 · Feb 15 '15 22:02