Chinese-Word-Vectors Data check.

Open: zhangjh915 opened this issue 5 years ago • 1 comment

Have you run any data checks on the w2v dictionaries, for example for outliers? What is the range of the embedding values? Do I need to normalize them?

Jul 22 '19 10:07 zhangjh915

You don't need to worry about this. All word vectors are trained with the ngram2vec toolkit, which is a superset of the word2vec and fasttext toolkits. Thus, you can use these embeddings just as you would use word2vec or fasttext embeddings.

Jul 23 '19 02:07 shenshen-hungry
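
For anyone who still wants to inspect the values themselves, here is a minimal sketch of such a check. It assumes the vectors are in the word2vec text format (a "count dim" header line, then one "word v1 ... v_dim" line per entry). The helper names `load_text_vectors`, `summarize`, and `l2_normalize` and the two-word sample are hypothetical, made up for illustration; they are not part of ngram2vec or of this repository.

```python
import numpy as np


def load_text_vectors(lines):
    """Parse word2vec-style text: a 'count dim' header, then 'word v1 ... v_dim'."""
    it = iter(lines)
    _count, dim = (int(x) for x in next(it).split())
    words, rows = [], []
    for line in it:
        parts = line.rstrip().split(" ")
        if len(parts) != dim + 1:
            continue  # skip malformed lines instead of guessing their layout
        words.append(parts[0])
        rows.append([float(x) for x in parts[1:]])
    return words, np.array(rows, dtype=np.float32)


def summarize(vectors):
    """Report the value range, non-finite entries, and per-word L2 norm range."""
    norms = np.linalg.norm(vectors, axis=1)
    return {
        "min": float(vectors.min()),
        "max": float(vectors.max()),
        "non_finite": int((~np.isfinite(vectors)).sum()),
        "norm_min": float(norms.min()),
        "norm_max": float(norms.max()),
    }


def l2_normalize(vectors, eps=1e-12):
    """Scale each row to unit length; eps guards against all-zero rows."""
    norms = np.linalg.norm(vectors, axis=1, keepdims=True)
    return vectors / np.maximum(norms, eps)


# Hypothetical in-memory sample standing in for a real embedding file.
sample = ["2 3", "猫 3.0 0.0 4.0", "狗 0.0 -2.0 0.0"]
words, vecs = load_text_vectors(sample)
stats = summarize(vecs)
unit = l2_normalize(vecs)
```

For a real file, the same functions could be fed an open file handle instead of the list. Whether to normalize is a downstream choice: unit-length rows make dot products equal to cosine similarity, but they also discard the norm information, so it may be worth comparing both on your task.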
