chinese-word-segmentation topic

List chinese-word-segmentation repositories

friso

474
Stars
94
Forks

High-performance Chinese tokenizer supporting both GBK and UTF-8 charsets, based on the MMSEG algorithm and written in ANSI C. Fully modular implementation that can be easily embedded in other...

jcseg

905
Stars
212
Forks

Jcseg is a lightweight NLP framework developed in Java. It provides CJK and English segmentation based on the MMSEG algorithm, along with keyword extraction, key sentence extraction, summary extraction imp...

cjieba-py

15
Stars
0
Forks

Python cffi binding to CppJieba

g2pC

231
Stars
30
Forks

g2pC: A Context-aware Grapheme-to-Phoneme Conversion module for Chinese

WordSeg

191
Stars
40
Forks

A PyTorch implementation of BiLSTM / BERT / RoBERTa (+ BiLSTM + CRF) models for Chinese word segmentation (中文分词).

MicroTokenizer

143
Stars
22
Forks

A micro tokenizer for Chinese: a tiny Chinese word segmentation engine covering a comprehensive set of algorithms

Chinese-Word-Vectors

11.6k
Stars
2.3k
Forks

100+ pre-trained Chinese word vectors

monpa

244
Stars
26
Forks

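MONPA (罔拍) is a multi-task model that provides Traditional Chinese word segmentation, part-of-speech tagging, and named entity recognition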

Jiagu

3.2k
Stars
610
Forks

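Jiagu: a deep learning natural language processing toolkit. Knowledge graph relation extraction, Chinese word segmentation, part-of-speech tagging, named entity recognition, sentiment analysis, new word discovery, keyword extraction, text summarization, text clustering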