Cantonese Jyutping Support

Open TTWNO-zz opened this issue 8 years ago • 3 comments

For implementation in dragonmapper...

TTWNO-zz avatar Jul 12 '16 16:07 TTWNO-zz

Rebase against latest. Can you squash where necessary?

Can you take out the version part so it can be changed when the release is cut?

tony avatar May 27 '17 23:05 tony

As you can tell, I don't spend a lot of time on my Chinese GitHub projects anymore. Thanks for contributing!

This is a great start. It will need some work before it's ready to merge. I'm happy to work with you on the things that need to be changed.

  1. You have over 2000 Cantonese syllables in this pull request. I believe that's too many.
  2. You're missing some important syllables like jyut from Jyut-ping ;)
  3. This doesn't follow the behavior of other modules in Zhon. You'll probably want to mimic the way that the zhuyin module does it.
    • Use a regular expression pattern of initials and finals, not a list of syllables (you can use a list of syllables for the tests).
    • For constants like tones (call it marks instead), please mimic the constants in the Python standard library's string module, which are strings, not lists (for one thing, lists are mutable).
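To illustrate the point about string constants, here is a minimal sketch. The names tone_re and try_mutate are hypothetical helpers made up for this illustration; only marks comes from the suggestion above.

```python
import re
import string

# The standard library exposes its character sets as plain strings.
assert isinstance(string.digits, str)

# Tone numbers as a string constant, named "marks" as suggested above.
marks = '123456'

# A string drops straight into a regex character class.
tone_re = re.compile('[{}]'.format(marks))


def try_mutate(constant):
    """Return True if the constant can be changed in place (hypothetical helper)."""
    try:
        constant[0] = '9'
    except TypeError:
        # str does not support item assignment, so the constant is safe.
        return False
    return True
```

With a string, try_mutate(marks) fails to change anything, while try_mutate(list(marks)) would silently succeed, which is the risk with list constants.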

I'd suggest using a resource like these three charts to create a regular expression pattern like the Zhuyin and Pinyin ones in Zhon.

Here is a short example of jyutping.py using the syllables that end in aa and ai:

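# Every letter that can appear in a Jyutping syllable.
characters = 'bpmfdtnlgkhwzcsjaeiouy'

# Tone numbers, kept as a string like the constants in the string module.
marks = '123456'

# Optional initial + final + tone number; only the aa and ai finals are shown.
syl = syllable = (
    '(?:'
    '(?:(?:[gk]w|[bpmfdtnlzcsgkhwj])?aa)|'
    '(?:(?:[gk]w|ng|[bpmfdtnlzcsgkhwj])?ai)'
    ')[{marks}]'
).format(marks=marks)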
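A rough sketch of how a pattern like this could be checked against a list of syllables in the tests. The pattern is copied from the example above; SYLLABLE_RE, is_syllable, and the two sample lists are hypothetical names and data for illustration, not part of Zhon.

```python
import re

marks = '123456'

# Pattern copied from the example above (covers only the aa and ai finals).
syllable = (
    '(?:'
    '(?:(?:[gk]w|[bpmfdtnlzcsgkhwj])?aa)|'
    '(?:(?:[gk]w|ng|[bpmfdtnlzcsgkhwj])?ai)'
    ')[{marks}]'
).format(marks=marks)

SYLLABLE_RE = re.compile(syllable)

# Hypothetical test data: a plain list of syllables is fine for tests.
accepted = ['baa1', 'gwaa3', 'aa1', 'ngai4', 'kwai1', 'sai3']
# Rejected by this partial pattern: missing tone, tone out of range,
# a letter that is not an initial, and a final the sketch does not cover yet.
rejected = ['baa', 'baa7', 'xaa1', 'jyut6']


def is_syllable(text):
    """Return True if the whole string is one syllable the pattern accepts."""
    return SYLLABLE_RE.fullmatch(text) is not None


# All groups are non-capturing, so findall returns the full matches.
found = SYLLABLE_RE.findall('baa1 sai3')
```

Once the remaining finals are added, syllables such as jyut6 would move from the rejected list to the accepted one.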

tsroten avatar May 28 '17 01:05 tsroten

Awesome! Thanks for the help!

Will get to it when I can :-)

TTWNO-zz avatar May 28 '17 14:05 TTWNO-zz