wordvectors

Pre-trained word vectors of 30+ languages

Results 20 wordvectors issues

I am using it with Korean in gensim 4.0.x. I tried KeyedVectors.load('ko.bin') and KeyedVectors.load_word2vec_format('ko.bin'), but there was an error: UnicodeDecodeError: 'utf-8' codec can't decode byte 0x80 in position...
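One way to narrow down an error like this is to check what kind of file is being opened. A Python pickle (protocol 2 or later) begins with the byte 0x80, whereas the word2vec C format begins with an ASCII header line holding the vocabulary size and the vector dimension. The sketch below is only a heuristic: `sniff_vector_file` is a hypothetical helper, and it assumes the truncated error refers to the start of the file, which the report does not confirm.

```python
import re


def sniff_vector_file(path):
    """Guess the on-disk format from the first bytes (heuristic only)."""
    with open(path, "rb") as f:
        head = f.read(64)
    if head[:1] == b"\x80":
        # Pickle protocol 2+ starts with the PROTO opcode, byte 0x80.
        return "pickle"
    first_line = head.split(b"\n", 1)[0].strip()
    if re.fullmatch(rb"\d+ \d+", first_line):
        # word2vec text and C binary formats both start with "<vocab> <dim>".
        return "word2vec"
    return "unknown"
```

A "pickle" result would suggest the file was written by a library's own save routine rather than in the word2vec format, in which case load_word2vec_format would not apply. Whether a given gensim version can read such a file is a separate question.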

Hi, I am trying to load the Chinese pre-trained word2vec vectors with word_vectors = KeyedVectors.load_word2vec_format(path, binary=True) # C binary format, but it throws this error.

Hello, first of all, thank you for the pre-trained model. Since there are many ways to train a fasttext model for Korean, I am curious about how you trained your...

Since the advantage of a subword model is that vectors for new words can be built from pre-trained character n-grams, I wonder how I can create a new word vector from the data.bin...
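The idea behind subword composition can be sketched as follows: wrap the word in boundary marks, list its character n-grams, and average the vectors of the n-grams that are known. This is a minimal illustration, not a reader for data.bin. `ngram_vectors` is a hypothetical dictionary from n-gram string to vector, and the defaults of 3 and 6 follow fastText's default minimum and maximum n-gram lengths. fastText itself hashes n-grams into buckets rather than storing them by string.

```python
def char_ngrams(word, minn=3, maxn=6):
    """Character n-grams of the word wrapped in '<' and '>' boundary marks."""
    wrapped = f"<{word}>"
    return [
        wrapped[i:i + n]
        for n in range(minn, maxn + 1)
        for i in range(len(wrapped) - n + 1)
    ]


def compose_vector(word, ngram_vectors, minn=3, maxn=6):
    """Average the vectors of the n-grams that have an entry in the table."""
    found = [
        ngram_vectors[g]
        for g in char_ngrams(word, minn, maxn)
        if g in ngram_vectors
    ]
    if not found:
        return None  # no known n-gram, so no vector can be composed
    dim = len(found[0])
    return [sum(v[k] for v in found) / len(found) for k in range(dim)]
```

Returning None when no n-gram is known keeps the caller from mistaking a zero vector for a real embedding.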

Hi, I downloaded the French embeddings and extracted the zip file. How can I load these embeddings in Python code and return the embedding for a specified word, e.g.:...
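A lookup of this kind might be sketched as below, assuming the extracted file is in the plain word2vec text format (a header line with the vocabulary size and dimension, then one word and its values per line). The issue does not say which format the French archive uses, so treat that as an assumption. `load_text_vectors` is a hypothetical helper, not part of any library.

```python
def load_text_vectors(path, encoding="utf-8"):
    """Read word2vec text format into a dict of word -> list of floats."""
    vectors = {}
    with open(path, encoding=encoding) as f:
        # Header line: "<vocab_size> <dim>"
        _count, dim = map(int, f.readline().split())
        for line in f:
            parts = line.rstrip().split(" ")
            if len(parts) != dim + 1:
                continue  # skip blank or malformed lines
            vectors[parts[0]] = [float(x) for x in parts[1:]]
    return vectors
```

With the table in memory, `vectors.get(word)` returns the embedding for a word, or None if the word is not in the vocabulary. In gensim, KeyedVectors.load_word2vec_format(path, binary=False) reads the same text layout.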

Dear Kyubyong, great work - thank you very much for providing these word vectors! One question: Which model did you use to train your word vectors with word2vec? Skip-gram or...

Thank you very much for this project. It seems very useful. I don't seem to be able to use the fasttext files, at least not the Russian or Turkish ones....

Hi, could you mention in the README which dictionary (and which version) you used for the Japanese morphological analyzer (i.e., MeCab)? I couldn't find the dictionary via the [mecab-python-0.996](https://pypi.org/project/mecab-python/) page since it...

The latest dump, also the articles, not all namespaces

Thanks for putting this together! However, the embedding vectors are just the weights from a shallow NN, which offers far less info than a complete language model (just like the...