pkuseg-python issues

Results 115 pkuseg-python issues


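Words from the training set are not segmented as words

My training set contains some words, such as “查不到” ("cannot be found"), which appears about 100 times. At test time, without a user dictionary, this word is never segmented as a single word. What is going on?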

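Questions about training a domain-specific model

1. What exactly is the data format of the `trainFile` and `testFile` files? Are there specific constraints, or is there a sample?
2. For incremental training, is it enough to set the `init_model` parameter to a domain model? For example, for the medical domain, when I train on a new dataset, should `init_model` be set to "medicine"?
3. Regarding the training parameters, is there an evaluation criterion for deciding how many `train_iter` iterations to run?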

Daoming009
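A minimal sketch of a training file layout, assuming that `trainFile` and `testFile` are UTF-8 text with one sentence per line and words separated by spaces. That is a reading of the project README, not something confirmed in this issue. The helper names `write_segmented_corpus` and `read_segmented_corpus` are hypothetical, and the commented `pkuseg.train` call shows an assumed call shape in which `init_model` takes a path to a model directory.

```python
import os
import tempfile


def write_segmented_corpus(sentences, path):
    """Write one sentence per line, words separated by single spaces, as UTF-8."""
    with open(path, "w", encoding="utf-8") as f:
        for words in sentences:
            f.write(" ".join(words) + "\n")


def read_segmented_corpus(path):
    """Read the file back into a list of word lists, skipping blank lines."""
    with open(path, encoding="utf-8") as f:
        return [line.split() for line in f if line.strip()]


# Hypothetical toy data; a real training set needs far more sentences.
train_sentences = [["我", "查不到", "订单"], ["订单", "查不到", "了"]]

corpus_dir = tempfile.mkdtemp()
train_path = os.path.join(corpus_dir, "train.txt")
write_segmented_corpus(train_sentences, train_path)
loaded = read_segmented_corpus(train_path)

# Assumed call shape (not executed here; requires pkuseg and a test file):
# import pkuseg
# pkuseg.train(train_path, test_path, "./models",
#              train_iter=10, init_model="./pretrained")
```

Reading the file back before training is a cheap way to confirm that each line splits into the intended words.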

Can't build on macOS

`FileNotFoundError: [Errno 2] No such file or directory: '/Users/umuoy1/.pkuseg/news/unigram_word.txt'` macOS Big Sur 11.1, Python 3.9.5. I have already downloaded `news.zip` to `~/.pkuseg/`.

umuoy1

Incorrect weight copy in feature expanding while increasing tag size

In https://github.com/lancopku/pkuseg-python/blob/d581c95e3ddec3f236ebe74fd626b6e1cfe112ee/pkuseg/model.py#L25, when `n_tag` is increased, the copy ignores the spacing and simply places the original weights at the end of the new array. This is inconsistent with the addressing method in `_get_tag_tag_feature_id`.

cuter44
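The re-addressing problem described above can be sketched as follows. This is an illustration under an assumed flat layout (a node block of `n_feature * n_tag` weights followed by an `n_tag * n_tag` edge block), not the library's actual code. The functions `node_index`, `edge_index`, and `expand_weights` are hypothetical names, and `n_feature` is held fixed for simplicity.

```python
def node_index(n_tag, feature_id, tag):
    """Assumed address of a (feature, tag) weight in the node block."""
    return feature_id * n_tag + tag


def edge_index(n_feature, n_tag, pre_tag, tag):
    """Assumed address of a (pre_tag, tag) weight in the edge block."""
    return n_feature * n_tag + tag * n_tag + pre_tag


def expand_weights(old_w, n_feature, old_n_tag, new_n_tag):
    """Copy weights into a larger vector, re-addressing every entry.

    Each weight is looked up with the old tag count and stored at the
    address computed with the new tag count, so the stride stays consistent.
    """
    new_w = [0.0] * (new_n_tag * (n_feature + new_n_tag))
    for f in range(n_feature):
        for t in range(old_n_tag):
            new_w[node_index(new_n_tag, f, t)] = old_w[node_index(old_n_tag, f, t)]
    for t in range(old_n_tag):
        for p in range(old_n_tag):
            src = edge_index(n_feature, old_n_tag, p, t)
            dst = edge_index(n_feature, new_n_tag, p, t)
            new_w[dst] = old_w[src]
    return new_w


# One feature, growing from 2 tags to 3 tags.
old_w = [0.1, 0.2, 1.0, 2.0, 3.0, 4.0]
new_w = expand_weights(old_w, n_feature=1, old_n_tag=2, new_n_tag=3)

# A naive tail copy keeps the old stride and lands on the wrong addresses.
naive_w = [0.0] * len(new_w)
naive_w[-4:] = old_w[-4:]
```

With the tail copy, the weight for `(pre_tag=0, tag=1)` is no longer found at the address that `edge_index` computes for three tags, which is the inconsistency the report points out.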

update setup.py to support install from source before numpy is installed

Hey, first, thanks for your work! This PR fixes a bug that occurs when no wheel is available and installing from source requires the requirements to be pre-installed.

SagiMedina

Help with error: FileNotFoundError: [Errno 2] No such file or directory

WARNING: features.pkl does not exist, try loading features.json
WARNING: features.json does not exist, try loading using old format
Traceback (most recent call last):
File "C:/Users/Administrator/Desktop/pytest/test.py", line 3, in seg =...

wwe8866

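BUG: after loading a dictionary with millions of entries, every character is segmented separately

When the user-defined dictionary reaches millions of entries, segmentation breaks and every character is split off on its own. What causes this bug?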

Fan9

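How do I use the ctb8 pretrained model?

I want to install pkuseg using the third installation method, and I chose the ctb8 data from the version 0.0.11 pretrained models. How do I use the setup.py in the source code to set up the environment?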

Xusihao1996

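A bug in user_dict matching

At line 90, in the `solve` function of the `Preprocesser` class, `j = last_word_idx + 1` should be set at the same time as `found=True`. Counterexample: with user_dict.txt: 车车在中国, running `pkuseg.cut('电动车在上海')` produces 电动/车在/上海.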

kekeadou
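The index handling at issue can be illustrated with a small stand-alone sketch. It uses a plain set and greedy longest match rather than the library's own matching structure, and it assumes the flattened dictionary text in the report stands for the two entries 车 and 车在中国. The function `split_by_user_dict` is a hypothetical name. The point it shows is that after a match, scanning resumes immediately after the last matched word and not after a longer candidate that failed.

```python
def split_by_user_dict(text, user_words):
    """Greedy longest match; returns (segment, is_user_word) pairs."""
    max_len = max((len(w) for w in user_words), default=0)
    result = []
    pending_start = 0  # start of the not-yet-emitted non-dictionary run
    i = 0
    while i < len(text):
        match_end = -1
        # Try the longest candidate first so longer entries win.
        for end in range(min(len(text), i + max_len), i, -1):
            if text[i:end] in user_words:
                match_end = end
                break
        if match_end == -1:
            i += 1
            continue
        if pending_start < i:
            result.append((text[pending_start:i], False))
        result.append((text[i:match_end], True))
        # Resume right after the matched word, never inside a failed
        # longer candidate such as a partial match of 车在中国.
        i = match_end
        pending_start = i
    if pending_start < len(text):
        result.append((text[pending_start:], False))
    return result


# Assumed dictionary entries, based on the counterexample in the report.
segments = split_by_user_dict("电动车在上海", {"车", "车在中国"})
```

In this sketch the non-dictionary runs would still go to the statistical segmenter; only the dictionary word boundaries are fixed here.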

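Keywords containing spaces added to the custom user dictionary do not take effect

> seg2 = pkuseg.pkuseg(model_name='web', user_dict=["Color OS", "前摄像头"])
>
> print(seg2.cut("Color OS")) # ['Color', 'OS']
> print(seg2.cut("前摄像头")) # ['前摄像头']

I added keywords containing spaces to the user dictionary, but they have no effect during segmentation.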

tlemar
