THULAC-Python

Published 20 hours ago


An Efficient Lexical Analyzer for Chinese

Readme
Issues

Results 87 THULAC-Python issues


Extremely slow

1 comment

On Windows with a 2.4 GHz CPU, word segmentation plus part-of-speech tagging on a 30 MB text ran for several hours.

thu1 = thulac.thulac() memory error

5 comments

Traceback (most recent call last):
  File "", line 1, in
  File "C:\Python27\lib\site-packages\thulac\__init__.py", line 58, in __init__
    self.__tagging_decoder.init((self.__prefix+"model_c_model.bin"),(self.__prefix+"model_c_dat.bin"),(self.__prefix+"model_c_label.txt"))
  File "C:\Python27\lib\site-packages\thulac\character\CBTaggingDecoder.py", line 36, in init
    self.model = CBModel(modelFile)
  File "C:\Python27\lib\site-packages\thulac\character\CBModel.py", line 58,...

HappyShadowWalker

User-defined dictionary has no effect

3 comments

See title.

Segment tag not available for Python 3.8

8 comments

When I try thulac.cut($sentence), an error is raised:
```
tmp, tagged = self.__tagging_decoder.segmentTag(raw, __poc_cands)
start = time.clock()
AttributeError: module 'time' has no attribute 'clock'
```
It turns out that the function time.clock()...

AttributeError: module 'time' has no attribute 'clock'

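3 comments

I looked into it: Python 3.8 and later are not supported, so you have to downgrade manually to 3.7 or lower. I suggest fixing this directly; it is a very simple change.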

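Was the -deli parameter removed in the new version?

2 comments

Has the new version removed the -deli parameter? -deli delimeter: sets the separator between a word and its part-of-speech tag; the default is an underscore (_).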

Why does this problem still occur when the txt file is encoded as UTF-8?

2 comments

UnicodeDecodeError: 'gbk' codec can't decode byte 0xa8 in position 0: incomplete multibyte sequence
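
The 'gbk' codec in this error suggests the file was decoded with the platform default encoding rather than UTF-8; on Chinese-locale Windows, Python's `open()` without an `encoding` argument typically falls back to GBK. A minimal sketch of reading the text yourself with an explicit encoding follows. The helper name `read_utf8_text` and the sample file are illustrative only, and whether THULAC's own file-reading code can be given an encoding is not confirmed by this page.

```python
import os
import tempfile


def read_utf8_text(path: str) -> str:
    """Read a text file as UTF-8 regardless of the platform default encoding."""
    # Passing encoding explicitly avoids the locale default (e.g. GBK on
    # Chinese-locale Windows), which is what open() uses when it is omitted.
    with open(path, encoding="utf-8") as f:
        return f.read()


# Round-trip demo with a temporary UTF-8 file (hypothetical sample data).
with tempfile.TemporaryDirectory() as tmp:
    sample_path = os.path.join(tmp, "sample.txt")
    with open(sample_path, "w", encoding="utf-8") as f:
        f.write("我爱北京天安门")
    text = read_utf8_text(sample_path)

print(len(text))  # number of characters read back
```

The resulting string can then be passed to the segmenter directly instead of letting the library open the file.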

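Fix for AttributeError: module 'time' has no attribute 'clock'; runs under Python 3.9.5 on Windows

The problem is that the line start = time.clock() calls time.clock(), a function that is no longer supported. On closer inspection, the variable start is never used after it is assigned, so it is a dead variable. After deleting that line, the code runs normally under Python 3.9.5 on Windows.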
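
For readers who would rather not edit the installed package, a common workaround is to restore the missing attribute before the failing call. The sketch below aliases `time.clock` to `time.perf_counter`; the helper name `ensure_time_clock` is made up for illustration, and this is a stopgap under the assumption that the library only calls `time.clock()` for timing, as the report above describes.

```python
import time


def ensure_time_clock() -> None:
    """Restore time.clock as an alias of time.perf_counter if it is missing."""
    # time.clock was removed in Python 3.8; perf_counter is the usual
    # replacement for measuring elapsed time.
    if not hasattr(time, "clock"):
        time.clock = time.perf_counter


ensure_time_clock()

# With the alias in place, the library can be used as usual, e.g.:
# import thulac
# thu = thulac.thulac()
```

Because the attribute is looked up when `time.clock()` is called, the shim only needs to run before the first segmentation call.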

haomingdouranggouqil

module 'time' has no attribute 'clock'

4 comments

In Python 3.8, time.clock() was removed, but it is still called when segmenting sentences.

BrandNewJimZhang

UnicodeDecodeError: 'gbk' codec can't decode byte 0xab in position 8: illegal multibyte sequence

1 comment

On Windows 11, in Python command-line mode, GBK is required for some reason (on Linux, UTF-8 works fine), and the output also has to be switched to GBK to be readable.


About

chinese-nlp

2.0k Stars

334 Forks
