chinese-word2vec

Error reading cn.skipgram.bin.tar.gz

Open Jacky-Chiu opened this issue 7 years ago • 6 comments

model = gensim.models.KeyedVectors.load_word2vec_format(fdir + 'cn.skipgram.bin.tar.gz', binary=True)

ValueError: invalid literal for int() with base 10: 'cn.skipgram.bin\x00\x00。。。。。。

Hello, I get this error when reading the cn.skipgram.bin.tar.gz file. I have been searching for a long time and still can't figure out the cause.

Jacky-Chiu avatar Nov 20 '17 03:11 Jacky-Chiu

Try again after extracting the archive?

Senmumu avatar Jan 23 '18 03:01 Senmumu

Can you unzip this file and try again?
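
For anyone unsure what "unzip" means here: the ValueError in the original post shows the archive's internal file name (cn.skipgram.bin followed by NUL padding) where the loader expected header numbers, which suggests it was reading tar data rather than the model itself. Below is a minimal sketch of extracting the archive with Python's standard tarfile module. The helper name extract_archive and the destination directory are my own placeholders, not part of this repo.

```python
import tarfile


def extract_archive(archive_path, dest_dir="."):
    """Extract a .tar.gz archive and return the names of its members.

    Hypothetical helper for illustration; only use it on archives you trust.
    """
    with tarfile.open(archive_path, "r:gz") as tar:
        names = tar.getnames()      # e.g. the .bin file stored inside
        tar.extractall(dest_dir)    # write the members to dest_dir
    return names
```

After that, pass the path of the extracted .bin file (not the .tar.gz) to load_word2vec_format.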

Senmumu avatar Jan 23 '18 03:01 Senmumu

Please refer to how Mikolov's word2vec source code reads the file.
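
As a rough illustration of that format: the original word2vec tool writes a text header line with the vocabulary size and vector dimension, then for each word the word bytes, a space, the vector as raw 32-bit floats, and a newline. Here is a minimal pure-Python sketch of a reader for that layout. The function name read_word2vec_binary is made up for this example, and little-endian floats and UTF-8 words are assumptions on my part, so treat it as a sketch rather than the reference reader.

```python
import struct


def read_word2vec_binary(path, errors="ignore"):
    """Read a word2vec binary file into a {word: [floats]} dict.

    Hypothetical helper. Assumes little-endian float32 vectors and
    UTF-8 encoded words; undecodable bytes are handled per `errors`.
    """
    vectors = {}
    with open(path, "rb") as f:
        # Header line: "<vocab_size> <dim>\n". A tar archive fails here
        # with "invalid literal for int()".
        header = f.readline().split()
        vocab_size, dim = int(header[0]), int(header[1])
        for _ in range(vocab_size):
            chars = bytearray()
            while True:
                ch = f.read(1)
                if not ch:
                    raise EOFError("unexpected end of file while reading a word")
                if ch == b" ":
                    break                # a space terminates the word
                if ch != b"\n":
                    chars += ch          # skip the newline left after the previous vector
            word = chars.decode("utf-8", errors=errors)
            raw = f.read(4 * dim)
            if len(raw) != 4 * dim:
                raise EOFError("unexpected end of file while reading a vector")
            vectors[word] = list(struct.unpack("<%df" % dim, raw))
    return vectors
```

Decoding with errors="ignore" drops bytes that are not valid UTF-8 instead of raising, which is one way to get past a "codec can't decode bytes" failure on a malformed word.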

to-shimo avatar Jan 25 '18 06:01 to-shimo

I get the error "'utf-8' codec can't decode bytes in position 96-97: unexpected end of data" when I try to load the unzipped bin file.

hy9be avatar Mar 31 '18 13:03 hy9be

I am loading the model with gensim without extracting it, and I get the error: 'utf-8' codec can't decode bytes in position 96-97: unexpected end of data. How can I load this model?

liyonglion avatar Apr 12 '18 06:04 liyonglion

Loading it like this works for me:

word2vec = gensim.models.KeyedVectors.load_word2vec_format('XXX', binary=True, unicode_errors='ignore')

yydai avatar May 20 '18 08:05 yydai