big-data topic
ParquetViewer
Simple Windows desktop application for viewing & querying Apache Parquet files
HaloDB
A fast, log structured key-value store.
SparkRDMA
This is an archive of the SparkRDMA project. The new repository with RDMA shuffle acceleration for Apache Spark is here: https://github.com/Nvidia/sparkucx
AutoDL
Automated Deep Learning without ANY human intervention. 1st solution for the AutoDL challenge@NeurIPS.
selinon
Advanced distributed task flow management on top of Celery
SGDLibrary
MATLAB/Octave library for stochastic optimization algorithms: Version 1.0.20
mobydq
:whale: Tool to automate data quality checks on data pipelines
PGM-index
State-of-the-art learned data structure that enables fast lookup, predecessor, range searches and updates in arrays of billions of items using orders of magnitude less space than traditional indexes
hyperspace
An open source indexing subsystem that brings index-based query acceleration to Apache Spark™ and big data workloads.
belajarpython.com
Open Source Indonesian Python Programming Tutorial Site