Chinese-Word-Vectors

100+ Chinese Word Vectors: over a hundred kinds of pre-trained Chinese word vectors

Results 58 Chinese-Word-Vectors issues

Some domain-specific terms, such as '混动' ("hybrid") and '混合动力' ("hybrid power"), seem to be missing.

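I used the pre-trained word vectors for the combination named in the title to compute word similarity, and the scores are not high. Could someone tell me what went wrong? ![image](https://user-images.githubusercontent.com/47171514/182988121-4395ad69-89e6-44ea-a6f3-485d5e87734b.png)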

Is there anywhere to download the models other than https://pan.baidu.com/ ? I could not download them from there without registering, and registration only works with Chinese...

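After downloading, I used the following statement to load the pre-trained model, but running the .py file produces no response at all: `model = gensim.models.KeyedVectors.load_word2vec_format('/text/sgns.financial.bigram-char')` When I switch to another model, one of the merged ones, it runs normally: `model = gensim.models.KeyedVectors.load_word2vec_format('/text/merge_sgns_bigram_char300.txt')` Why is that? Is the format of the first file wrong, or does it need a different statement to load the model? Thanks!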
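One way to narrow this down is to look at the first line of each file. In text mode, `load_word2vec_format` by default expects a header line holding two integers (vocabulary size and vector dimension). The sketch below is only a diagnostic under that assumption: the helper name `peek_header` is hypothetical, and the two tiny files are made-up stand-ins for the real downloads.

```python
import os
import tempfile


def peek_header(path, encoding="utf-8"):
    """Return (vocab_size, dim) if the first line looks like a word2vec
    text header, otherwise None."""
    with open(path, encoding=encoding, errors="replace") as f:
        fields = f.readline().split()
    if len(fields) != 2:
        return None
    try:
        return int(fields[0]), int(fields[1])
    except ValueError:
        return None


# Demo on two tiny stand-in files (not the real vector files).
with tempfile.TemporaryDirectory() as tmp:
    with_header = os.path.join(tmp, "with_header.txt")
    without_header = os.path.join(tmp, "without_header.txt")
    with open(with_header, "w", encoding="utf-8") as f:
        f.write("2 3\n国王 0.1 0.2 0.3\n王后 0.1 0.2 0.4\n")
    with open(without_header, "w", encoding="utf-8") as f:
        f.write("国王 0.1 0.2 0.3\n王后 0.1 0.2 0.4\n")
    header_found = peek_header(with_header)
    header_missing = peek_header(without_header)

print(header_found, header_missing)
```

Calling `peek_header('/text/sgns.financial.bigram-char')` and comparing it with the result for the merged file would show whether the two files differ in this respect. If both have a valid header, the cause lies elsewhere, for example in how long a larger file takes to load.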

What is the difference between the target word vector and the context word vector, and when would the target word vector be used?

No longer hosted on the university server; the files are now hosted on OSF: https://osf.io/c394y/ (you can choose which file to download from the left panel). Hi folks, great resource, thank...

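I used the Wikipedia vectors, sgns.wiki.word, and tried a few things. 国王 - 男人 + 女人 != 王后 (king - man + woman != queen): the vector values are far apart. 梨树 - 树 + 花 != 梨花 (pear tree - tree + flower != pear blossom). I also compared the Manhattan distances from 梨花 (pear blossom), 茶花 (camellia), and 水花 (water splash) to 花 (flower): 梨花 is relatively far from 花, at 85+, while 茶花 and 水花 are at about the same distance from 花, 77.58 and 79.27.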
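Word analogies are usually not checked for exact equality. The common approach ranks all other words by cosine similarity to the combined vector, leaves out the three query words, and checks whether the expected word is at or near the top. The sketch below shows that ranking next to a Manhattan distance for comparison. The three-dimensional vectors are made-up toy values, not values from sgns.wiki.word, and the function names are illustrative.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def manhattan(u, v):
    """L1 distance; depends on vector length, unlike cosine similarity."""
    return sum(abs(a - b) for a, b in zip(u, v))


def analogy(vectors, a, b, c, topn=1):
    """Rank words by cosine similarity to vec(a) - vec(b) + vec(c),
    excluding the three query words themselves."""
    target = [x - y + z for x, y, z in zip(vectors[a], vectors[b], vectors[c])]
    scored = [
        (word, cosine(target, vec))
        for word, vec in vectors.items()
        if word not in (a, b, c)
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:topn]


# Toy vectors, invented purely for illustration.
toy = {
    "国王": [0.9, 0.8, 0.1],
    "男人": [0.1, 0.9, 0.0],
    "女人": [0.1, 0.1, 0.9],
    "王后": [0.9, 0.1, 0.9],
    "梨花": [0.2, 0.3, 0.4],
}

best = analogy(toy, "国王", "男人", "女人")
print(best[0][0])
```

With real vectors, the same ranking over the full vocabulary shows where 王后 or 梨花 actually lands, which is more informative than comparing raw distances between unnormalized vectors.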

When I load [Sogou News Word + Character + Ngram 300d](https://pan.baidu.com/s/1svFOwFBKnnlsqrF1t99Lnw), a file named sgns.sogounews.bigram-char, with the following code, an error occurs:

```python
with open(WORD2VEC_PATH, encoding='utf-8') as f:
    for l in f.readlines():
        values = l.split()
        word = values[0]
        embeddings_index[word] = np.asarray(values[1:], dtype='float32')
```

The error is:

> ValueError:...
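The error message is truncated, so its cause cannot be confirmed here. Two things often trip up a loop like the one above on word2vec-style text files: a header line whose fields are not a word plus a vector, and tokens that themselves contain whitespace, so that `l.split()` leaves non-numeric pieces in `values[1:]`. The sketch below assumes the file starts with a `vocab_size dim` header and that fields are separated by single spaces. The function name `parse_vectors` is hypothetical, and the sample lines are invented.

```python
def parse_vectors(lines):
    """Parse word2vec-style text lines into {word: [floats]}.

    Assumes the first line is a "vocab_size dim" header. Splitting from
    the right keeps tokens that contain spaces intact. Returns the
    mapping and the number of lines that could not be parsed.
    """
    lines = iter(lines)
    header = next(lines).split()
    dim = int(header[1])

    embeddings = {}
    skipped = 0
    for line in lines:
        # Take the last `dim` fields as the vector, the rest as the word.
        parts = line.rstrip().rsplit(" ", dim)
        if len(parts) != dim + 1:
            skipped += 1
            continue
        try:
            vector = [float(x) for x in parts[1:]]
        except ValueError:
            skipped += 1
            continue
        embeddings[parts[0]] = vector
    return embeddings, skipped


# Invented sample: a header, a normal row, a token with a space, a bad row.
sample = ["3 2\n", "国王 0.1 0.2\n", "a b 0.3 0.4\n", "bad 0.5\n"]
embeddings, skipped = parse_vectors(sample)
print(len(embeddings), skipped)
```

For a real file, the open file object can be passed directly, as in `with open(WORD2VEC_PATH, encoding='utf-8') as f: embeddings, skipped = parse_vectors(f)`. This also avoids reading the whole file into memory with `readlines()`.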

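If the vectors are interchangeable, then words missing from this word-vector corpus can be looked up in another word-vector corpus. If they are not interchangeable, that will not work.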