big-data topic

List big-data repositories

ParquetViewer

666
Stars
78
Forks

Simple Windows desktop application for viewing and querying Apache Parquet files

HaloDB

490
Stars
101
Forks

A fast, log-structured key-value store.

SparkRDMA

244
Stars
70
Forks

This is an archive of the SparkRDMA project. The new repository with RDMA shuffle acceleration for Apache Spark is here: https://github.com/Nvidia/sparkucx

AutoDL

1.1k
Stars
215
Forks

Automated Deep Learning without ANY human intervention. 1st-place solution for the AutoDL challenge at NeurIPS.

selinon

298
Stars
33
Forks

Advanced distributed task flow management on top of Celery

SGDLibrary

210
Stars
86
Forks

MATLAB/Octave library for stochastic optimization algorithms: Version 1.0.20

mobydq

245
Stars
59
Forks

🐳 Tool to automate data quality checks on data pipelines

PGM-index

768
Stars
88
Forks

šŸ…State-of-the-art learned data structure that enables fast lookup, predecessor, range searches and updates in arrays of billions of items using orders of magnitude less space than traditional indexes

hyperspace

422
Stars
114
Forks

An open source indexing subsystem that brings index-based query acceleration to Apache Spark™ and big data workloads.

belajarpython.com

360
Stars
693
Forks

Open Source Indonesian Python Programming Tutorial Site