AlgorithmsOnSpark

Some popular algorithms (DBSCAN, KNN, FM, etc.) on Spark

Distributed Algorithms On Spark

This project implements some popular algorithms on Spark. See the original papers for the details of each algorithm.

It currently supports the following algorithms, and I plan to add more in the future.

  • Distributed KNN
  • Down Sampling
  • Over Sampling
  • Affinity Propagation
  • Distributed t-SNE
  • Factorization Machines
  • Multi-view Machines
  • Block Structures Factorization Machines
  • Timeseries models
  • DBSCAN
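
To make one item from the list above concrete, here is a minimal sketch of random down sampling in plain Python. It does not use Spark and is not this project's API: the function name `down_sample`, its parameters, and the choice to shrink every class to the size of the smallest one are all assumptions made for illustration. The project's distributed implementation may use a different strategy.

```python
import random
from collections import defaultdict


def down_sample(rows, label_of, seed=0):
    """Randomly drop rows from larger classes until every class
    has as many rows as the smallest class (hypothetical helper)."""
    rng = random.Random(seed)

    # Group rows by their class label.
    by_label = defaultdict(list)
    for row in rows:
        by_label[label_of(row)].append(row)

    if not by_label:
        return []

    # Every class is reduced to the size of the smallest class.
    target = min(len(group) for group in by_label.values())

    balanced = []
    for group in by_label.values():
        # Sample without replacement within each class.
        balanced.extend(rng.sample(group, target))
    return balanced


# Usage: 8 rows of class 0 and 2 rows of class 1 become 2 rows of each.
rows = [(i, 0) for i in range(8)] + [(i, 1) for i in range(2)]
balanced = down_sample(rows, label_of=lambda r: r[1])
```

Over sampling is the mirror image: instead of discarding rows from the larger classes, rows from the smaller classes are repeated (or synthesized) until the classes are balanced.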

This project supports Spark 2.x.

References

  • https://github.com/viirya/SparkAffinityPropagation
  • https://github.com/saurfang/spark-tsne
  • https://github.com/cloudml/zen
  • https://github.com/sryza/spark-timeseries
  • https://github.com/irvingc/dbscan-on-spark
  • http://mlwiki.org/index.php/Metric_Trees