chinese-text-segmentation topic

List of chinese-text-segmentation repositories

jcseg

905 stars, 212 forks

Jcseg is a lightweight NLP framework developed in Java. It provides CJK and English segmentation based on the MMSEG algorithm, along with keyword extraction, key sentence extraction, summary extraction imp...

jiebaR

338 stars, 110 forks

Chinese text segmentation with R. (Documentation updated 🎉: https://qinwenfeng.com/jiebaR/)

Pytorch-NLU

292 stars, 46 forks

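Pytorch-NLU is a toolkit for Chinese text classification and sequence labeling. It supports multi-class and multi-label classification of long and short Chinese texts, as well as sequence labeling tasks such as Chinese named entity recognition, part-of-speech tagging, word segmentation, and extractive text summarization. Pytorch NLU, a Chinese text classificat...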

zhparser

661 stars, 81 forks

zhparser is a PostgreSQL extension for full-text search of Chinese text.

SymSpell

3.0k stars, 281 forks

SymSpell: 1 million times faster spelling correction & fuzzy search through Symmetric Delete spelling correction algorithm

symspellpy

766 stars, 116 forks

Python port of SymSpell: 1 million times faster spelling correction & fuzzy search through Symmetric Delete spelling correction algorithm

hanlp-lucene-plugin

294 stars, 99 forks

HanLP Chinese word segmentation plugin for Lucene, supporting Lucene-based systems including Solr.

ik-analyzer

194 stars, 75 forks

Tokenizer supporting Lucene 5/6/7/8/9+ versions, LTS

kcws

2.1k stars, 649 forks

Deep learning Chinese word segmentation

jieba-php

1.3k stars, 258 forks

"結巴"中文分詞:做最好的 PHP 中文分詞、中文斷詞組件。 / "Jieba" (Chinese for "to stutter") Chinese text segmentation: built to be the best PHP Chinese word segmentation module.