hadoop-hdfs topic
cubefs
cloud-native distributed storage
seaweedfs
SeaweedFS is a fast distributed storage system for blobs, objects, files, and data lake, for billions of files! Blob store has O(1) disk seek, cloud tiering. Filer supports Cloud Drive, cross-DC activ...
MorphL-Community-Edition
MorphL Community Edition uses big data and machine learning to predict user behaviors in digital products and services with the end goal of increasing KPIs (click-through rates, conversion rates, etc....
dynamometer
A tool for scale and performance testing of HDFS with a specific focus on the NameNode.
data-engineering-interview-questions
More than 2,000 data engineer interview questions.
sparksql-for-hbase
Learn how to use Spark SQL and the HSpark connector package to create and query data tables that reside in HBase region servers.
datapipelines-essentials-python
Simplified ETL process in Hadoop using Apache Spark. Includes a complete ETL pipeline for a data lake. SparkSession extensions, DataFrame validation, Column extensions, SQL functions, and DataFrame transformati...
Big_DataHadoop_Projects
Big data projects implemented by Maniram Yadav
console
Open source data infrastructure platform. Designed for developers, built for speed.
TravelWebsite_BigDataAnalysis
Big data analysis of a travel website (partial Ctrip data), a Hadoop course project (undergraduate coursework level)