Chinese-Word-Vectors
Data check.
Have you run any data checks on the w2v dictionaries, for example looking for outliers? What is the range of the embedding values? Do I need to normalize them?
You don't need to worry about this. All word vectors are trained with the ngram2vec toolkit, which is a superset of the word2vec and fasttext toolkits. You can therefore use these embeddings just as you would use word2vec or fasttext embeddings.
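If you still want to inspect the values yourself, here is a minimal sketch of a range check and an optional L2 normalization. It assumes the vectors are in the plain word2vec text format (an optional `count dim` header line, then one `word v1 v2 ...` line per word). The function names `load_text_vectors`, `summarize`, and `l2_normalize` are hypothetical helpers written for this example and are not part of ngram2vec, and the two toy vectors are made up for illustration.

```python
import io
import math


def load_text_vectors(handle):
    """Parse word2vec-style text: optional 'count dim' header, then 'word v1 ... vd'."""
    vectors = {}
    for line_no, line in enumerate(handle):
        parts = line.rstrip().split(" ")
        if line_no == 0 and len(parts) == 2:
            continue  # assumed header line: vocabulary size and dimension
        if len(parts) < 2:
            continue  # skip blank or malformed lines
        vectors[parts[0]] = [float(x) for x in parts[1:]]
    return vectors


def l2_norm(vec):
    return math.sqrt(sum(x * x for x in vec))


def summarize(vectors):
    """Report the overall value range and the range of vector lengths."""
    values = [x for vec in vectors.values() for x in vec]
    norms = [l2_norm(vec) for vec in vectors.values()]
    return {
        "min_value": min(values),
        "max_value": max(values),
        "min_norm": min(norms),
        "max_norm": max(norms),
    }


def l2_normalize(vec):
    """Scale a vector to unit length; an all-zero vector is returned unchanged."""
    norm = l2_norm(vec)
    return list(vec) if norm == 0 else [x / norm for x in vec]


# Toy data standing in for a real embedding file (hypothetical values).
toy_file = io.StringIO("2 3\n你好 3.0 0.0 4.0\n世界 -1.0 2.0 2.0\n")
vectors = load_text_vectors(toy_file)
stats = summarize(vectors)
unit_vectors = {word: l2_normalize(vec) for word, vec in vectors.items()}
print(stats)
```

For a real file, pass `open(path, encoding="utf-8")` in place of the `StringIO` object. Whether normalization is needed depends on the downstream use: cosine similarity gives the same result with or without it, while dot-product or Euclidean comparisons are affected by vector length.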