
convert a lot of zeros and ones to fewer real numbers

Dimensionality reduction for sparse binary data

See http://fastml.com/dimensionality-reduction-for-sparse-binary-data/ for a description.

adult_results.txt - results of testing on the _adult_ dataset
batch.txt - a batch file of commands for conversion
csv_output_snippet.py - how to output csv from gensim
first.py - extract some lines from a file, see batch.txt

gensim_add_labels.py - add labels (lost during conversion)
gensim_lda.py - perform LDA conversion
gensim_lsi.py - perform LSI conversion
gensim_rp.py - perform random projections conversion
gensim_tfidf.py - perform TF-IDF preprocessing
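
To illustrate what "convert a lot of zeros and ones to fewer real numbers" means in practice, here is a minimal standard-library sketch of a sign random projection. It is not the code in gensim_rp.py and does not use gensim; the function name `random_projection`, the input layout (each row given as a list of active feature indices), and the ±1/sqrt(k) scaling are assumptions made for this example.

```python
import random


def random_projection(rows, n_features, n_components, seed=0):
    """Project sparse binary rows onto n_components real-valued dimensions.

    Each row is a list of the indices whose value is 1 (hypothetical layout).
    """
    rng = random.Random(seed)
    scale = 1.0 / n_components ** 0.5
    # One random vector of +/- scale entries per input feature.
    matrix = [
        [rng.choice((-scale, scale)) for _ in range(n_components)]
        for _ in range(n_features)
    ]
    projected = []
    for active in rows:
        vec = [0.0] * n_components
        # A binary row is just the sum of the vectors of its active features.
        for j in active:
            for k in range(n_components):
                vec[k] += matrix[j][k]
        projected.append(vec)
    return projected


# Three rows over 10 binary features, reduced to 4 real numbers each.
rows = [[0, 3, 7], [1, 3], []]
reduced = random_projection(rows, n_features=10, n_components=4)
```

Because the input is binary, no multiplication is needed: each output vector is the sum of the random vectors belonging to the features that are switched on, which is why sparse rows are cheap to project.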

libsvm2csv.py - convert libsvm file to csv
rf.r - random forest code used for testing
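
For readers unfamiliar with the libsvm format, the following is a rough sketch of the kind of conversion libsvm2csv.py performs. It is not the script itself: the function names, the `n_features` argument, and the choice to put the label in the first column are assumptions made for this example.

```python
import csv
import io


def libsvm_to_rows(lines, n_features):
    """Yield dense rows [label, x1, ..., xn] from libsvm-formatted lines."""
    for line in lines:
        line = line.strip()
        if not line:
            continue
        tokens = line.split()
        label = tokens[0]
        row = [0.0] * n_features
        for token in tokens[1:]:
            index, value = token.split(":")
            # libsvm feature indices are 1-based.
            row[int(index) - 1] = float(value)
        yield [label] + row


def libsvm_to_csv(lines, n_features):
    """Return the CSV text for an iterable of libsvm-formatted lines."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerows(libsvm_to_rows(lines, n_features))
    return buffer.getvalue()


sample = ["1 2:1 4:1", "-1 1:1"]
csv_text = libsvm_to_csv(sample, n_features=4)
```

Features missing from a libsvm line are written as zeros, so the CSV can become very wide for sparse data with many features.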

spams_nmf.py - perform NMF conversion. Requires SPAMS, and scikit-learn for TF-IDF.