
Support transliteration of Chinese characters (pinyin)

Open rhdunn opened this issue 11 years ago • 0 comments

Transliteration of Chinese characters should be based on the pinyin system. The readings are available in the Unicode Character Database files, and the ucd-tools project should handle extracting the pinyin transcriptions from them.
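As a rough illustration of the extraction step, here is a minimal Python sketch. It assumes the readings come from the tab-separated Unihan readings data in the Unicode Character Database, where each line has the form `U+XXXX<TAB>kField<TAB>value` and the relevant fields are `kMandarin`, `kCantonese` and `kJapaneseOn`. The function name `parse_readings` and the sample lines are hypothetical; this is not the ucd-tools API, and the sample is not copied verbatim from the database.

```python
import re

# Assumed line shape: U+XXXX<TAB>kField<TAB>value
_LINE = re.compile(r"^U\+([0-9A-F]{4,6})\t(k\w+)\t(.+)$")


def parse_readings(lines, fields=("kMandarin", "kCantonese", "kJapaneseOn")):
    """Collect the requested reading fields for each character.

    Hypothetical helper for illustration only, not part of ucd-tools.
    Returns {character: {field: [reading, ...]}}.
    """
    readings = {}
    for line in lines:
        line = line.rstrip("\n")
        # Skip blank lines and comment lines.
        if not line or line.startswith("#"):
            continue
        match = _LINE.match(line)
        if match is None:
            continue
        codepoint, field, value = match.groups()
        if field in fields:
            char = chr(int(codepoint, 16))
            # A field may list several space-separated readings.
            readings.setdefault(char, {})[field] = value.split()
    return readings


# Illustrative sample lines (assumed values, not verbatim database content).
sample = [
    "# illustrative sample",
    "U+4E2D\tkCantonese\tzung1",
    "U+4E2D\tkJapaneseOn\tCHUU",
    "U+4E2D\tkMandarin\tzh\u014dng",
    "U+4E2D\tkTotalStrokes\t4",
]

table = parse_readings(sample)
```

Keeping the field names as keys means the Mandarin, Cantonese and Japanese readings stay separate, so each can be routed to its own transcription rules later.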

Specifically, transcriptions for Mandarin, Cantonese and Japanese pronunciations of the Chinese characters should be supported.

In addition, the pinyin transcriptions should support two pronunciation modes:

  1. phonetic/IPA -- an accurate IPA-based phonetic transcription;
  2. Latin/English -- an English approximation of the Chinese.

To be complete, this requires improving the phoneme model to support tone markers.
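To make the tone-marker requirement concrete, here is a minimal Python sketch of separating the tone from a Mandarin pinyin syllable written with diacritics. It relies on Unicode NFD decomposition to isolate the combining tone mark, then maps the tone number to a Chao tone letter sequence of the kind an IPA transcription could carry. The names `split_tone`, `numbered` and `CHAO` are hypothetical, using 5 for the neutral tone is a convention assumed here, and none of this reflects how the cainteoir-engine phoneme model is actually structured.

```python
import unicodedata

# Combining marks used for the four Mandarin tones in pinyin.
TONE_MARKS = {
    "\u0304": 1,  # macron: high level
    "\u0301": 2,  # acute: rising
    "\u030C": 3,  # caron: dipping
    "\u0300": 4,  # grave: falling
}

# Conventional Chao tone letters for the citation tones.
# The neutral tone (5, an assumed convention) is left unmarked.
CHAO = {1: "\u02e5", 2: "\u02e7\u02e5", 3: "\u02e8\u02e9\u02e6", 4: "\u02e5\u02e9", 5: ""}


def split_tone(syllable):
    """Return (syllable without tone mark, tone number)."""
    tone = 5
    base = []
    for ch in unicodedata.normalize("NFD", syllable):
        if ch in TONE_MARKS:
            tone = TONE_MARKS[ch]
        else:
            base.append(ch)
    # Recompose so that a diaeresis (as in u-umlaut) stays attached.
    return unicodedata.normalize("NFC", "".join(base)), tone


def numbered(syllable):
    """Convert diacritic pinyin to numbered pinyin, e.g. for table lookups."""
    base, tone = split_tone(syllable)
    return f"{base}{tone}"
```

Storing the tone as a separate value, rather than baking it into the vowel, would let the phonetic/IPA mode emit tone letters while the Latin/English mode simply drops them.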



rhdunn, Dec 15 '13 14:12