
Induce word representations using random indexing (RI)

random-indexing-wordrepresentations

by Joseph Turian


For information about random indexing, see: http://www.sics.se/~mange/random_indexing.html
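As a rough illustration of the technique, here is a minimal sketch of random indexing in plain Python. It is not this package's implementation: the function names, the default `dim`, `nonzeros`, and `window` values, and the symmetric context window are all assumptions made for the example. Each word gets a fixed sparse ternary index vector, and a word's representation is the sum of the index vectors of the words that occur near it.

```python
import random


def make_index_vector(dim, nonzeros, rng):
    """Return a sparse ternary vector: `nonzeros` entries are +1 or -1, the rest 0."""
    vec = [0] * dim
    for pos in rng.sample(range(dim), nonzeros):
        vec[pos] = rng.choice((-1, 1))
    return vec


def random_indexing(sentences, dim=100, nonzeros=10, window=2, seed=0):
    """Hypothetical helper: build context vectors from tokenized sentences.

    `sentences` is a list of token lists. Returns (index, context), two dicts
    mapping each word to its index vector and its accumulated context vector.
    """
    rng = random.Random(seed)
    index = {}
    context = {}
    # Assign every word a fixed random index vector and an empty context vector.
    for sentence in sentences:
        for word in sentence:
            if word not in index:
                index[word] = make_index_vector(dim, nonzeros, rng)
                context[word] = [0] * dim
    # Add each neighbor's index vector to the focus word's context vector.
    for sentence in sentences:
        for i, word in enumerate(sentence):
            lo = max(0, i - window)
            hi = min(len(sentence), i + window + 1)
            for j in range(lo, hi):
                if j == i:
                    continue
                target = context[word]
                for k, value in enumerate(index[sentence[j]]):
                    if value:
                        target[k] += value
    return index, context
```

Words that appear in similar contexts end up with similar context vectors, which can then be compared with, for example, cosine similarity.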

You can control the hyperparameters either by editing hyperparameters.random-indexing.yaml or by passing command-line options.

Another implementation (in Java) is semanticvectors: http://code.google.com/p/semanticvectors/

See also: http://github.com/turian/pyrandomprojection, a generic library for transforming a Python dictionary into a low-dimensional numpy array.

This code is based upon my neural language model code (http://github.com/turian/neural-language-model), so it shares similar idioms for the data it expects. For example, instead of building a vocabulary on the fly, we assume that the vocabulary has been pre-extracted and will be read in. We also assume that the training input has one sentence per line.
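To make these input assumptions concrete, the sketch below reads a pre-extracted vocabulary and then maps one-sentence-per-line text to word ids. It is only an illustration: the one-word-per-line vocabulary format, the `*UNKNOWN*` placeholder token, and the function names are assumptions for the example, not the formats or API this code actually uses.

```python
import io

# Hypothetical placeholder for words that are missing from the vocabulary.
UNKNOWN = "*UNKNOWN*"


def read_vocabulary(lines):
    """Map each word to an integer id, assuming one word per line."""
    vocab = {}
    for line in lines:
        word = line.strip()
        if word and word not in vocab:
            vocab[word] = len(vocab)
    # Reserve an id for out-of-vocabulary words if the file did not list one.
    vocab.setdefault(UNKNOWN, len(vocab))
    return vocab


def read_sentences(lines, vocab):
    """Yield one list of word ids per non-empty line (one sentence per line)."""
    for line in lines:
        tokens = line.split()
        if tokens:
            yield [vocab.get(token, vocab[UNKNOWN]) for token in tokens]


# Usage with in-memory text; real input would come from open(...) instead.
vocab = read_vocabulary(io.StringIO("the\ncat\nsat\n"))
sentences = list(read_sentences(io.StringIO("the cat sat\n\nthe dog sat\n"), vocab))
```

Because the vocabulary is fixed before training starts, the training text can be streamed line by line without a separate pass to collect words.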

REQUIREMENTS:

* My Python common code: http://github.com/turian/common