mapreduce topic

List mapreduce repositories

BigData-Notes

15.4k
Stars
4.2k
Forks

A beginner's guide to big data :star:

dpark

2.7k
Stars
535
Forks

Python clone of Spark, a MapReduce-like framework in Python

BigData-Interview

1.6k
Stars
442
Forks

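:dart: :star2: [Big data interview questions] A collection of big data interview questions gathered from around the web, together with the author's own answers and summaries. Currently covers interview topics for the Hadoop/Hive/Spark/Flink/Hbase/Kafka/Zookeeper frameworks.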

alldata

2.5k
Stars
845
Forks

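🔥🔥 AllData is a definable data middle platform: with the data platform as its foundation, the data middle platform as the bridge, the machine learning platform as the factory, and large-model applications as the upstream products, it provides a full-chain digital solution. To purchase the commercial edition or join the technical community: https://docs.qq.com/doc/DVHlkSEtvVXVCd...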

bigdata-growth

1.3k
Stars
331
Forks

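A big data knowledge repository covering data warehouse modeling, real-time computing, big data, data middle platforms, system design, Java, algorithms, and more.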

BigData

665
Stars
225
Forks

💎🔥 Big data study notes

data-algorithms-book

1.1k
Stars
662
Forks

MapReduce, Spark, Java, and Scala for Data Algorithms Book

tdigest

376
Stars
53
Forks

t-Digest data structure in Python. Useful for percentiles and quantiles, including in distributed environments like PySpark

cascading

342
Stars
222
Forks

Cascading is a feature-rich API for defining and executing complex, fault-tolerant data processing flows locally or on a cluster.