Yusuf Ganiyu

9 repositories owned by Yusuf Ganiyu

e2e-data-engineering

180 Stars, 83 Forks

An end-to-end data engineering pipeline that orchestrates data ingestion, processing, and storage using Apache Airflow, Python, Apache Kafka, Apache Zookeeper, Apache Spark, and Cassandra. All compone...

changecapture-e2e

30 Stars, 15 Forks

This project shows how to capture changes from a Postgres database and stream them into Kafka.

RedditDataEngineering

55 Stars, 38 Forks

This project provides a comprehensive data pipeline solution to extract, transform, and load (ETL) Reddit data into a Redshift data warehouse. The pipeline leverages a combination of tools and service...

FlinkCommerce

33 Stars, 16 Forks

This repository contains an Apache Flink application for real-time sales analytics built using Docker Compose to orchestrate the necessary infrastructure components, including Apache Flink, Elasticsea...

RealtimeStreamingEngineering

26 Stars, 16 Forks

This project serves as a comprehensive guide to building an end-to-end data engineering pipeline using TCP/IP Socket, Apache Spark, OpenAI LLM, Kafka and Elasticsearch. It covers each stage from data...

realtime-voting-data-engineering

26 Stars, 18 Forks

This repository contains the code for a real-time election voting system. The system is built using Python, Kafka, Spark Streaming, Postgres and Streamlit. The system is built using Docker Compose to e...

modern-data-eng-dbt-databricks-azure

21 Stars, 10 Forks

In this project, we set up an end-to-end data engineering pipeline using Apache Spark, Azure Databricks, and Data Build Tool (DBT), with Azure as our cloud provider.

FootballDataEngineering

16 Stars, 14 Forks

An end-to-end data engineering pipeline that fetches data from Wikipedia, cleans and transforms it with Apache Airflow and saves it on Azure Data Lake. Other processing takes place on Azure Data Facto...

SparkingFlow

25 Stars, 17 Forks

This project demonstrates how to use Apache Airflow to submit jobs to an Apache Spark cluster in different programming languages, using Python, Scala, and Java as examples.