big-data topic

List big-data repositories

img2dataset

3.3k stars, 316 forks

Easily turn large sets of image URLs into an image dataset. Can download, resize, and package 100M URLs in 20 hours on one machine.

iotdb

5.5k stars, 998 forks

Apache IoTDB

books

739 stars, 270 forks

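A curated collection of books covering C & C++, Git, Java, Keras, Linux, NLP, Python, Scala, TensorFlow, big data, recommender systems, databases, data mining, machine learning, deep learning, algorithms, and more.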

couchdb-fauxton

377 stars, 225 forks

Fauxton is the new Web UI for CouchDB

matano

1.4k stars, 91 forks

Open source security data lake for threat hunting, detection & response, and cybersecurity analytics at petabyte scale on AWS

amazon-s3-find-and-forget

233 stars, 36 forks

Amazon S3 Find and Forget is a solution to handle data erasure requests from data lakes stored on Amazon S3, for example, pursuant to the European General Data Protection Regulation (GDPR)

apex-core

351 stars, 176 forks

Mirror of Apache Apex core

data-science-live-book

217 stars, 112 forks

An open source book to learn data science, data analysis and machine learning, suitable for all ages!

couchdb-docker

249 stars, 131 forks

Semi-official Apache CouchDB Docker images

qbeast-spark

199 stars, 17 forks

Qbeast-spark: DataSource enabling multi-dimensional indexing and efficient data sampling. Big Data, free from the unnecessary!