
Train sense2vec in Chinese

Open JingxinLee opened this issue 3 years ago • 1 comments

I'm trying to train sense2vec on a Chinese Wikipedia corpus, but I ran into this error: The 'noun_chunks' syntax iterator is not implemented for language 'zh'. Does anyone know how to deal with this? How should I write the labels in the noun_chunks function, and how can I find the labels I need?

JingxinLee, Apr 22 '22 03:04

The problem starts at doc = merge_phrases(doc) and ends at https://github.com/explosion/sense2vec/blob/d689bb65ce0f6c597c891cea3ba279ad1f92916f/sense2vec/util.py#L117

I manually created a syntax_iterators.py within zh, but it doesn't work.

JingxinLee, Apr 24 '22 02:04
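
For anyone hitting the same error, here is a minimal sketch of one possible workaround. It is not the sense2vec or spaCy authors' method. It assumes spaCy v3, where Doc.noun_chunks calls whatever function is stored in nlp.vocab.get_noun_chunks and expects it to yield (start, end, label) tuples. Instead of dependency labels (which vary by model; nlp.get_pipe("parser").labels lists the ones your pipeline uses), it groups tokens by coarse POS tag. The names NP_TAGS, HEAD_TAGS, pos_runs, zh_noun_chunks and install are hypothetical and only for illustration, and the tag sets are a guess you would need to tune for Chinese.

```python
from typing import Iterator, List, Tuple

# Assumption: UPOS tags allowed inside a noun phrase. Tune for your model.
NP_TAGS = {"NOUN", "PROPN", "ADJ", "NUM"}
# Assumption: a chunk must end on one of these tags.
HEAD_TAGS = {"NOUN", "PROPN"}


def pos_runs(tags: List[str]) -> Iterator[Tuple[int, int]]:
    """Yield (start, end) for each maximal run of NP_TAGS, trimmed to end on a noun."""
    start = None
    # The empty-string sentinel flushes a run that reaches the end of the list.
    for i, tag in enumerate(list(tags) + [""]):
        if tag in NP_TAGS:
            if start is None:
                start = i
            continue
        if start is not None:
            end = i
            # Drop trailing modifiers so the chunk ends on a noun or proper noun.
            while end > start and tags[end - 1] not in HEAD_TAGS:
                end -= 1
            if end > start:
                yield start, end
            start = None


def zh_noun_chunks(doclike):
    """spaCy-style syntax iterator over a Doc or Span: yields (start, end, label)."""
    tokens = list(doclike)
    np_label = doclike.doc.vocab.strings.add("NP")
    for start, end in pos_runs([t.pos_ for t in tokens]):
        # Token.i is the index in the parent Doc, so this also works for a Span.
        yield tokens[start].i, tokens[end - 1].i + 1, np_label


def install(nlp):
    """Register the iterator. Call this before any Doc is created."""
    nlp.vocab.get_noun_chunks = zh_noun_chunks
    return nlp
```

With a Chinese pipeline that includes a tagger, calling install(nlp) right after spacy.load(...) and before processing any text should let doc.noun_chunks return spans instead of raising. Editing syntax_iterators.py inside the installed package only takes effect if the language's defaults also reference it, which may be why that attempt did not work.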