Radim Řehůřek
gensim
Topic Modelling for Humans
sqlitedict
Persistent dict, backed by sqlite3 and pickle, multithread-safe.
smart_open
Utils for streaming large files (S3, HDFS, gzip, bz2...)
gensim-data
Data repository for pretrained NLP models and NLP corpora.
bounter
Efficient Counter that uses a limited (bounded) amount of memory regardless of data size.
gensim-simserver
No longer maintained as open source; use scaletext.com instead.
sparsesvd
Python wrapper around SVDLIBC, a fast library for sparse singular value decomposition (SVD).
data_science_python
Source code for the "Practical Data Science in Python" tutorial.
sim-shootout
Code for "Performance shootout between nearest-neighbour libraries": http://radimrehurek.com/2013/11/performance-shootout-of-nearest-neighbours-intro
topic_modeling_tutorial
Instructions and code for the EuroPython 2014 training session "Topic Modeling for Fun and Profit".