Built text and image clustering models using unsupervised machine learning algorithms such as nearest neighbors, k-means, and LDA, and used techniques such as expectation maximization, locality sensitive...

Machine Learning Clustering and Retrieval: Text and Image Clustering Models

Description

  • Built Wikipedia article and image retrieval models using clustering algorithms such as k-nearest neighbors, k-means, latent Dirichlet allocation, and hierarchical clustering.
  • Used expectation maximization, locality sensitive hashing, and Gibbs sampling to build Gaussian mixture and mixed membership models that improve the assignment of data points to clusters.
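
As an illustration of the locality sensitive hashing idea mentioned above, here is a minimal sketch using random hyperplanes and plain NumPy. It is not the code from this repository's notebooks (which rely on GraphLab Create); the function names `build_lsh_index` and `lsh_query` and all parameters are hypothetical, and the search only looks inside the query's own bin, so it can miss the true nearest neighbor.

```python
import numpy as np


def build_lsh_index(data, num_bits, seed=0):
    """Hash each row of `data` into a bin using random hyperplanes (illustrative helper)."""
    rng = np.random.default_rng(seed)
    # One random hyperplane per bit; the sign of the projection gives the bit.
    planes = rng.standard_normal((data.shape[1], num_bits))
    bits = ((data @ planes) >= 0).astype(np.int64)
    # Interpret each row of bits as a binary number to get an integer bin id.
    powers = 1 << np.arange(num_bits - 1, -1, -1)
    bin_ids = bits @ powers
    table = {}
    for row, bin_id in enumerate(bin_ids):
        table.setdefault(int(bin_id), []).append(row)
    return planes, powers, table


def lsh_query(vec, data, planes, powers, table):
    """Return the index of the most cosine-similar row in the query's bin, or None."""
    bits = ((vec @ planes) >= 0).astype(np.int64)
    candidates = table.get(int(bits @ powers), [])
    if not candidates:
        return None
    cand = data[candidates]
    sims = (cand @ vec) / (np.linalg.norm(cand, axis=1) * np.linalg.norm(vec))
    return candidates[int(np.argmax(sims))]
```

More bits give smaller bins and faster queries at the cost of missing more true neighbors; a fuller implementation would usually also search neighboring bins.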

Code

  1. Nearest Neighbors Search
  2. 1 Nearest Neighbor with Locality Sensitive Hashing
  3. K Means
  4. Expectation Maximization
  5. Expectation Maximization - Image Data (Gaussian Mixtures)
  6. Latent Dirichlet Allocation - Mixed Membership Model
  7. Hierarchical Clustering
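
For readers unfamiliar with the algorithms listed above, the following is a minimal sketch of k-means (Lloyd's algorithm) in plain NumPy. It is an assumption-laden illustration rather than the notebook code in this repository: the function name `kmeans`, its parameters, and the random initialization are all hypothetical choices.

```python
import numpy as np


def kmeans(data, k, num_iters=100, seed=0):
    """Cluster the rows of `data` into k groups (illustrative sketch)."""
    rng = np.random.default_rng(seed)
    # Initialize centroids as k distinct data points chosen at random.
    centroids = data[rng.choice(len(data), size=k, replace=False)]
    for _ in range(num_iters):
        # Assignment step: each point goes to its closest centroid.
        dists = np.linalg.norm(data[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Update step: each centroid becomes the mean of its assigned points.
        # An empty cluster keeps its previous centroid.
        new_centroids = np.array([
            data[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
            for j in range(k)
        ])
        if np.allclose(new_centroids, centroids):
            break
        centroids = new_centroids
    return centroids, labels
```

The result depends on the initial centroids, so in practice the algorithm is often run several times with different seeds or with a smarter initialization such as k-means++.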

Data

Programming Language

Python

Packages

Anaconda, GraphLab Create (see its installation guide)

Tools/IDE

Jupyter notebook (IPython)

How to use it

  1. Fork this repository to get your own copy
  2. Clone your copy to your local system
  3. Install the necessary packages

Note

This repository does not contain optimized machine learning models. It only explores various models that can be built with different machine learning algorithms (either implemented here or used directly from the GraphLab Create package) to perform different tasks.