random-indexing-wordrepresentations
Induce word representations using random indexing (RI)
by Joseph Turian
For information about random indexing, see: http://www.sics.se/~mange/random_indexing.html
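As a rough illustration of the general technique (not this repository's actual code), the sketch below assigns each vocabulary word a sparse random index vector of +1/-1 entries and builds each word's representation by summing the index vectors of its neighbors within a context window. The function names and the parameters dim, nonzeros, window and seed are hypothetical and do not correspond to the hyperparameters in this project.

```python
import random


def make_index_vector(dim, nonzeros, rng):
    """Build a sparse ternary vector: `nonzeros` random positions set to +1 or -1."""
    vec = [0] * dim
    for pos in rng.sample(range(dim), nonzeros):
        vec[pos] = rng.choice((-1, 1))
    return vec


def random_indexing(sentences, vocab, dim=100, nonzeros=10, window=2, seed=0):
    """Return {word: context vector} for every word in `vocab`.

    `sentences` is an iterable of token lists. All parameter names and
    defaults here are illustrative assumptions, not this project's settings.
    """
    rng = random.Random(seed)
    # Fixed random index vector per vocabulary word.
    index = {w: make_index_vector(dim, nonzeros, rng) for w in vocab}
    # Context vectors start at zero and accumulate neighbors' index vectors.
    context = {w: [0] * dim for w in vocab}
    for tokens in sentences:
        for i, word in enumerate(tokens):
            if word not in context:
                continue  # skip out-of-vocabulary focus words
            lo = max(0, i - window)
            hi = min(len(tokens), i + window + 1)
            for j in range(lo, hi):
                if j == i or tokens[j] not in index:
                    continue
                iv = index[tokens[j]]
                cv = context[word]
                for k in range(dim):
                    cv[k] += iv[k]
    return context
```

Words that occur in the same contexts end up with similar vectors. For instance, with the sentences "the cat sat" and "the dog sat" and window=1, "cat" and "dog" receive identical vectors because they share the same neighbors.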
You can control the hyperparameters by editing hyperparameters.random-indexing.yaml or by passing command-line options.
Another implementation (in Java) is semanticvectors: http://code.google.com/p/semanticvectors/
See also http://github.com/turian/pyrandomprojection, a generic library for transforming a Python dictionary into a low-dimensional numpy array.
This code is based upon my neural language model code (http://github.com/turian/neural-language-model), so it follows similar conventions for the data it expects. For example, instead of building a vocabulary on the fly, it assumes that the vocabulary has been pre-extracted and will be read in. It also assumes that the training input has one sentence per line.
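To make these input conventions concrete, here is a minimal sketch of reading a pre-extracted vocabulary and a one-sentence-per-line training file. It assumes the vocabulary file lists one word per line and that tokens are whitespace-separated; the actual file formats this code expects may differ. The function names and the "*UNKNOWN*" placeholder token are hypothetical.

```python
def load_vocabulary(path):
    """Read a pre-extracted vocabulary, assuming one word per line."""
    with open(path, encoding="utf-8") as f:
        return [line.strip() for line in f if line.strip()]


def read_sentences(path, vocab, unknown="*UNKNOWN*"):
    """Yield one token list per non-empty line of the training file.

    Tokens outside the vocabulary are replaced by `unknown`
    (a hypothetical placeholder, not necessarily what this project uses).
    """
    known = set(vocab)
    with open(path, encoding="utf-8") as f:
        for line in f:
            tokens = line.split()  # assumes whitespace tokenization
            if tokens:
                yield [t if t in known else unknown for t in tokens]
```

Because the vocabulary is fixed before training starts, every word's vector can be allocated up front, and the training file can be streamed line by line without being held in memory.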
REQUIREMENTS:
  * My Python common code: http://github.com/turian/common