Magpie is a really good tool for text classification, but I ran into some problems when using my own data. I hope someone can give me some useful advice. Thanks a lot!
First, I already have the word vectors. I arranged them as follows and stored them in a '.txt' file:
setosa 0.222 0.625 0.068 0.042
setosa 0.167 0.417 0.068 0.042
setosa 0.111 0.5 0.051 0.042
When I run:
magpie.init_word_vectors('/home/exam/magpie-master/magpie-master/Iris_data.txt', vec_dim=128)
I get the following error:
Traceback (most recent call last):
File "", line 1, in
File "magpie/main.py", line 237, in init_word_vectors
self.train_word2vec(train_dir, vec_dim=vec_dim)
File "magpie/main.py", line 252, in train_word2vec
self.word2vec_model = train_word2vec(train_dir, vec_dim=vec_dim)
File "magpie/base/word2vec.py", line 120, in train_word2vec
window=WORD2VEC_CONTEXT,
File "/usr/local/lib/python2.7/dist-packages/gensim/models/word2vec.py", line 469, in __init__
self.build_vocab(sentences, trim_rule=trim_rule)
File "/usr/local/lib/python2.7/dist-packages/gensim/models/word2vec.py", line 533, in build_vocab
self.scan_vocab(sentences, progress_per=progress_per, trim_rule=trim_rule) # initial survey
File "/usr/local/lib/python2.7/dist-packages/gensim/models/word2vec.py", line 545, in scan_vocab
for sentence_no, sentence in enumerate(sentences):
File "magpie/base/word2vec.py", line 108, in __iter__
files = {filename[:-4] for filename in os.listdir(self.dirname)}
OSError: [Errno 20] Not a directory: '/home/exam/magpie-master/magpie-master/Iris_data.txt'
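The last frame of the traceback points at the cause: `os.listdir` is called on the path that was passed in, so that path has to be a directory rather than a single file. Below is a minimal sketch that reproduces the same error outside Magpie. The temporary file and the helper name `list_training_files` are made up for illustration, and the code is written for Python 3, where the error is raised as `NotADirectoryError`, a subclass of `OSError`.

```python
import errno
import os
import tempfile


def list_training_files(path):
    """Hypothetical helper: list a corpus directory, failing early with a clear message."""
    if not os.path.isdir(path):
        raise ValueError("expected a directory of training files, got: %r" % path)
    return sorted(os.listdir(path))


# Create a throwaway file standing in for Iris_data.txt.
with tempfile.NamedTemporaryFile(suffix=".txt", delete=False) as handle:
    file_path = handle.name

try:
    try:
        os.listdir(file_path)  # same call as in the last traceback frame
        listdir_errno = None
    except OSError as exc:
        listdir_errno = exc.errno  # ENOTDIR, reported as "Not a directory"

    try:
        list_training_files(file_path)
        guard_message = None
    except ValueError as exc:
        guard_message = str(exc)
finally:
    os.remove(file_path)
```

A check like the one in `list_training_files` only makes the failure easier to read; it does not turn a vector file into something a training routine can consume.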
@exampen init_word_vectors trains the vectors from a corpus, so it expects a directory of training documents rather than a file of ready-made vectors. If you want to load prebuilt vectors, you should serialize them in a gensim format and pass them as a parameter to the Magpie constructor.
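As a rough sketch of the first half of that suggestion: gensim can read vectors stored in the plain-text word2vec format, which is the same layout as the file in the question plus a header line giving the number of words and the vector dimension. The helper below adds that header using only the standard library. The name `to_word2vec_text` and the file names are hypothetical. Two assumptions are baked in: every row must have the same number of values, and if a token appears more than once (as `setosa` does in the sample) only the first row is kept, since the format expects one vector per word.

```python
import os
import tempfile


def to_word2vec_text(src_path, dst_path):
    """Rewrite 'token v1 v2 ...' rows as word2vec text format (header: '<count> <dim>')."""
    vectors = {}
    order = []
    dim = None
    with open(src_path) as src:
        for line_no, line in enumerate(src, start=1):
            parts = line.split()
            if not parts:
                continue  # skip blank lines
            token, values = parts[0], parts[1:]
            for value in values:
                float(value)  # raises ValueError on a non-numeric field
            if dim is None:
                dim = len(values)
            elif len(values) != dim:
                raise ValueError(
                    "line %d: expected %d values, got %d" % (line_no, dim, len(values))
                )
            if token not in vectors:  # assumption: keep the first vector per token
                vectors[token] = values
                order.append(token)
    with open(dst_path, "w") as dst:
        dst.write("%d %d\n" % (len(order), dim or 0))
        for token in order:
            dst.write("%s %s\n" % (token, " ".join(vectors[token])))
    return len(order), dim


# Demo on the three sample rows from the question.
workdir = tempfile.mkdtemp()
src = os.path.join(workdir, "Iris_data.txt")
dst = os.path.join(workdir, "Iris_data.w2v.txt")
with open(src, "w") as f:
    f.write(
        "setosa 0.222 0.625 0.068 0.042\n"
        "setosa 0.167 0.417 0.068 0.042\n"
        "setosa 0.111 0.5 0.051 0.042\n"
    )
count, dim = to_word2vec_text(src, dst)
with open(dst) as f:
    converted = f.read()
```

The converted file should then be readable with gensim's `KeyedVectors.load_word2vec_format(dst, binary=False)`. Whether Magpie accepts that object directly or needs a full `Word2Vec` model written with `.save()` likely depends on the Magpie and gensim versions, so it is worth checking how the constructor handles its word2vec argument in your copy. Note also that the sample rows have 4 values each, which would not match `vec_dim=128`.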