big-data topic
metorikku
A simplified, lightweight ETL framework based on Apache Spark
poseidon
A search engine that can hold 100 trillion lines of log data.
geni
A Clojure dataframe library that runs on Spark
DataflowJavaSDK
Google Cloud Dataflow provides a simple, powerful model for building both batch and streaming parallel data processing pipelines.
succinct
Enabling queries on compressed data.
keyvi
Keyvi - the key value index. It is an in-memory FST-based data structure highly optimized for size and lookup performance.
keyvi
Keyvi - a key value index that powers the Cliqz search engine. It is an in-memory FST-based data structure highly optimized for size and lookup performance.
fili
Easily make RESTful web services for time series reporting with Big Data analytics engines like Druid and SQL databases.
spark-movie-lens
An online movie recommender using Spark, Python Flask, and the MovieLens dataset