word-embedding-dimensionality-selection
On the Dimensionality of Word Embedding
Thanks for sharing the code! Sorry if the question is silly; my understanding of word embeddings is still limited and I lack the required math background. Should the SignalMatrix implementation...
Hi, first, thank you for making a Python implementation of your paper. When I executed the package in cmd, I ran into an error. My environment is Windows 10, 64-bit, and...
Error message: Segmentation fault (core dumped)
Command: nohup python -m main --file data/train.txt --config_file config/train.yml --algorithm word2vec
The line self.noise = np.std(diff) * 0.5 seems to be inconsistent with the Frobenius term used in the paper, ||M1-M2||/(2sqrt(mn)). Should it instead be np.sum(diff**2)**(0.5)/(2*n)? By the way, isn't m = n = vocabulary size? Under this assumption, what's...
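For readers weighing this question, the three expressions quoted above can be compared numerically. The following is only a sketch: the function names and the random `diff` matrix are hypothetical stand-ins for M1 - M2, not code from the repository. It relies on the identity mean(x^2) = var(x) + mean(x)^2, which implies that the np.std form and the Frobenius form differ only through the mean of `diff`, and that the form proposed in the issue matches the Frobenius form when m = n.

```python
import numpy as np

def noise_from_std(diff):
    """Form quoted from the code: half the standard deviation of all entries."""
    return 0.5 * np.std(diff)

def noise_from_frobenius(diff):
    """Form quoted from the paper: ||M1 - M2||_F / (2 * sqrt(m * n))."""
    m, n = diff.shape
    return np.linalg.norm(diff, "fro") / (2.0 * np.sqrt(m * n))

def noise_from_issue_proposal(diff):
    """Form proposed in the issue: sqrt(sum(diff**2)) / (2 * n), n = column count."""
    n = diff.shape[1]
    return np.sum(diff ** 2) ** 0.5 / (2.0 * n)

# Hypothetical data: a square matrix of zero-mean random entries standing in
# for M1 - M2. Real difference matrices may not be zero-mean.
rng = np.random.default_rng(0)
diff = rng.normal(scale=1.3, size=(400, 400))

std_est = noise_from_std(diff)
frob_est = noise_from_frobenius(diff)
proposal_est = noise_from_issue_proposal(diff)

# Squared estimates differ by exactly (mean(diff) / 2) ** 2.
gap = frob_est ** 2 - std_est ** 2
expected_gap = (0.5 * diff.mean()) ** 2

print("std form:      ", std_est)
print("Frobenius form:", frob_est)
print("issue proposal:", proposal_est)
print("squared gap:   ", gap, "vs (mean/2)^2:", expected_gap)
```

If the entries of `diff` average close to zero, the np.std form and the Frobenius form should agree closely; if the matrix is not square, the issue's proposed form no longer equals the Frobenius form.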
cat config/word2vec_sample_config.yml
skip_window: 5
neg_samples: 1
vocabulary_size: 100000
min_count: 100
Beautiful work. I hope to see similar work for other types of embeddings, such as contextual word embeddings. Will this work with fastText? If not, what files I...
I have a question about the format of the corpus. I noticed that the text8 corpus in the data folder is written on just one line. I want to know whether there is no influence...
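One way to reason about this question is to check what whitespace tokenization does with line breaks. The sketch below assumes the corpus is tokenized with Python's `str.split()`; whether this repository reads its corpus that way is an assumption, not something confirmed here, and the sample sentence and the `tokenize` helper are hypothetical.

```python
def tokenize(text):
    # str.split() with no arguments splits on any run of whitespace,
    # so spaces and newlines are treated the same way.
    return text.split()

# Hypothetical sample text, once on a single line and once with line breaks.
one_line = "the quick brown fox jumps over the lazy dog"
multi_line = "the quick brown\nfox jumps over\nthe lazy dog\n"

same_tokens = tokenize(one_line) == tokenize(multi_line)
print(same_tokens)
```

Under that assumption, both layouts yield the same token sequence. A pipeline that treats each line as a separate sentence would behave differently, because context windows would then stop at line boundaries.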