hdfs topic

List hdfs repositories

Spark-with-Python

324 Stars · 259 Forks

Fundamentals of Spark with Python (using PySpark), code examples

rumble

208 Stars · 80 Forks

⛈️ RumbleDB 1.21.0 "Hawthorn blossom" 🌳 for Apache Spark | Run queries on your large-scale, messy JSON-like data (JSON, text, CSV, Parquet, ROOT, AVRO, SVM...) | No install required (just a jar to d...

kafka-connect-ui

496 Stars · 131 Forks

Web tool for Kafka Connect

hdfs-shell

150 Stars · 33 Forks

HDFS Shell is an HDFS manipulation tool for working with the functions integrated in Hadoop DFS

ElasticCTR

179 Stars · 45 Forks

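ElasticCTR, the PaddlePaddle elastic-computing recommendation system, is an open-source, Kubernetes-based solution for enterprise-grade recommendation systems. It combines high-accuracy CTR models continuously refined in Baidu's business scenarios, the large-scale distributed training capability of the open-source PaddlePaddle framework, and an industrial-grade elastic scheduling service for sparse parameters, helping...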

smart_open

3.1k Stars · 379 Forks

Utils for streaming large files (S3, HDFS, gzip, bz2...)

NNAnalytics

109 Stars · 71 Forks

NameNodeAnalytics is a self-help utility for scouting and maintaining the namespace of an HDFS instance.

HsunTzu

134 Stars · 38 Forks

HDFS compress tar zip snappy gzip uncompress untar codec hadoop spark

bigdata-file-viewer

282 Stars · 54 Forks

A cross-platform (Windows, macOS, Linux) desktop application for viewing common big data binary formats such as Parquet, ORC, and AVRO. Supports the local file system, HDFS, AWS S3, Azure Blob Storage, etc.