A list about Apache Kafka

awesome-kafka

This list is for anyone who wants to learn about Apache Kafka but does not have a starting point.

You can help by sending Pull Requests to add more information.

If you're not inclined to make PRs, you can tweet me at @infoslack

Table of Contents

  • Articles
  • Books
  • Courses
  • Papers
  • Presentations
  • Talks
  • Tools

Articles

  • http://blog.cloudera.com/blog/2014/09/apache-kafka-for-beginners/ - Apache Kafka for Beginners
  • http://sysadvent.blogspot.com.br/2014/12/day-4-introduction-to-kafka.html - Introduction to Kafka
  • https://medium.com/@ruurtjan/understanding-kafka-with-factorio-74e8fc9bf181 - Understanding Kafka with Factorio
  • http://www.confluent.io/blog/introducing-kafka-streams-stream-processing-made-simple - Introducing Kafka Streams: Stream Processing Made Simple
  • https://medium.com/swlh/apache-kafka-in-a-nutshell-5782b01d9ffb - Apache Kafka in a Nutshell
  • http://www.confluent.io/blog/apache-kafka-samza-and-the-unix-philosophy-of-distributed-data - Apache Kafka, Samza, and the Unix Philosophy of Distributed Data
  • http://www.confluent.io/blog/using-logs-to-build-a-solid-data-infrastructure-or-why-dual-writes-are-a-bad-idea/ - Using logs to build a solid data infrastructure (or: why dual writes are a bad idea)
  • http://www.confluent.io/blog/bottled-water-real-time-integration-of-postgresql-and-kafka/ - Bottled Water: Real-time integration of PostgreSQL and Kafka
  • http://www.confluent.io/blog/deploying-apache-kafka-on-aws-elastic-block-store-ebs - Deploying Apache Kafka on AWS Elastic Block Store (EBS)
  • http://www.confluent.io/blog/how-we-monitor-and-run-kafka-at-scale-signalfx - How We Monitor and Run Kafka At Scale
  • https://www.datadoghq.com/blog/monitoring-kafka-performance-metrics/ - Monitoring Kafka performance metrics
  • http://oobaloo.co.uk/kafka-for-uswitchs-event-pipeline - Kafka for uSwitch's Event Pipeline
  • http://blog.infochimps.com/2012/10/30/next-gen-real-time-streaming-storm-kafka-integration/ - Next Gen Real-time Streaming with Storm-Kafka Integration
  • https://medium.com/netflix-techblog/search?q=kafka - All Netflix Tech Blog articles tagged Kafka
  • https://engineering.linkedin.com/blog/topic/kafka - All LinkedIn Engineering blog articles tagged Kafka
  • https://medium.com/@stephane.maarek/how-to-use-apache-kafka-to-transform-a-batch-pipeline-into-a-real-time-one-831b48a6ad85 - How to use Apache Kafka to transform a batch pipeline into a real-time one
  • https://dev.to/barryosull/event-sourcing-what-it-is-and-why-its-awesome - Event Sourcing: What it is and why it's awesome
  • https://www.confluent.io/blog/getting-started-with-rust-and-kafka - Using Kafka with Rust
  • https://medium.com/@gayathrisiva8608/apache-kafka-is-a-distributed-streaming-system-that-can-publish-and-subscribe-a-stream-of-records-12af92e2d491 - Apache Kafka-Intro
  • https://www.loginradius.com/engineering/blog/stream-processing-using-kafka/ - Introduction to Stream Processing using Kafka Streams

Books

  • http://www.confluent.io/making-sense-of-stream-processing-ebook - Making sense of stream processing
  • http://shop.oreilly.com/product/0636920032175.do - Designing Data-Intensive Applications
  • http://apachekafkabook.com/ - Effective Kafka - A Hands-On Guide to Building Robust and Scalable Event-Driven Applications
  • https://www.confluent.io/wp-content/uploads/confluent-kafka-definitive-guide-complete.pdf - Kafka: The Definitive Guide
  • https://www.manning.com/books/kafka-in-action - Kafka in Action: a fast-paced introduction to every aspect of working with Kafka
  • https://www.manning.com/books/kafka-streams-in-action - Kafka Streams in Action: prepares you to use Kafka Streams in your own projects
  • https://www.manning.com/books/kafka-streams-in-action-second-edition - Kafka Streams in Action, Second Edition: teaches you to implement stream processing within the Kafka platform

Courses

Papers

  • http://www.vldb.org/pvldb/vol8/p1654-wang.pdf - Building a Replicated Logging System with Apache Kafka
  • http://research.microsoft.com/en-us/um/people/srikanth/netdb11/netdb11papers/netdb11-final12.pdf - Kafka: a Distributed Messaging System for Log Processing
  • http://sites.computer.org/debull/A12june/pipeline.pdf - Building LinkedIn’s Real-time Activity Data Pipeline
  • https://engineering.linkedin.com/distributed-systems/log-what-every-software-engineer-should-know-about-real-time-datas-unifying - The Log: What every software engineer should know about real-time data's unifying abstraction

Presentations

  • http://www.slideshare.net/charmalloc/developing-with-the-go-client-for-apache-kafka - Developing with the Go client for Apache Kafka
  • http://www.slideshare.net/miguno/apache-kafka-08-basic-training-verisign - Apache Kafka 0.8 basic training - Verisign
  • http://www.slideshare.net/charmalloc/developingwithapachekafka-29910685 - Developing Real-Time Data Pipelines with Apache Kafka
  • http://www.slideshare.net/AmazonWebServices/infrastructure-at-scale-apache-kafka-twitter-storm-elastic-search-arc303-aws-reinvent-2013 - Infrastructure at Scale: Apache Kafka, Twitter Storm & Elastic Search (ARC303) | AWS re:Invent 2013
  • http://www.slideshare.net/charmalloc/real-timestreamingdata-pipelinesapachekafka - Real-time streaming and data pipelines with Apache Kafka
  • http://www.slideshare.net/Hadoop_Summit/building-a-realtime-data-pipeline-apache-kafka-at-linkedin - Building a Real-time Data Pipeline: Apache Kafka at LinkedIn
  • http://www.slideshare.net/junrao/kafka-replication-apachecon2013 - Kafka Replication (ApacheCon 2013)
  • http://www.slideshare.net/mumrah/kafka-talk-tri-hug - Introduction and Overview of Apache Kafka

Talks

  • https://www.youtube.com/watch?v=qc33qMUvR7c - Introduction to Apache Kafka by Joe Stein
  • https://www.youtube.com/watch?v=9RMOc0SwRro - Apache Kafka and the Next 700 Stream Processing Systems by Jay Kreps
  • https://martin.kleppmann.com/2015/05/27/logs-for-data-infrastructure.html - Transcript of Martin Kleppmann (LinkedIn) - Using Logs To Build a Solid Data Infrastructure
  • https://www.youtube.com/watch?v=aJuo_bLSW6s - I ♥ Logs: Apache Kafka and Real-Time Data Integration
  • https://www.youtube.com/watch?v=InAKDEk7H0M - Apache Kafka: Real-time Streaming and Data Pipelines with Apache Kafka by Joe Stein
  • https://vimeo.com/63040812 - AJUG - Apache Kafka - Chris Curtin
  • https://www.youtube.com/playlist?list=PLkz1SCf5iB4enAR00Z46JwY9GGkaS2NON - Getting started guide on Apache Kafka by Learning Journal
  • https://youtu.be/qoeuufklgUE - The basics of what a KStream, KTable, and GlobalKTable are

Tools

  • Firehose - Firehose is an extensible, no-code, and cloud-native service to load real-time streaming data from Kafka to data stores, data lakes, and analytical storage systems.

  • Dagger - Dagger is an easy-to-use SQL framework built on top of Apache Flink for stateful processing of Kafka data.

  • Raccoon - Raccoon is a high-throughput, low-latency service that collects events in real time from your web apps, mobile apps, and services over multiple network protocols and publishes them to Kafka.

  • Stencil - Stencil is a Protobuf schema registry for Kafka that enables you to create, maintain and consume protobuf messages and APIs dynamically, efficiently, and reliably.

  • Meteor - Meteor is a scalable, easy-to-use, extensible framework that collects metadata from different cloud providers and on-prem sources and publishes it to Kafka.

  • Logit.io - Logit.io allows you to send logs and metrics from Kafka for centralised monitoring, alerting and analysis.

  • https://github.com/Microsoft/Availability-Monitor-for-Kafka

  • https://github.com/linkedin/Burrow

  • https://github.com/splee/burrower

  • https://github.com/yahoo/kafka-manager

  • https://github.com/tchiotludo/kafkahq

  • https://github.com/SourceLabOrg/kafka-webview

  • http://www.kafkatool.com/

  • https://github.com/kafka-ops/kafka-topology-builder - GitOps and automation for Apache Kafka

  • Strimzi - Operator for deploying and running Apache Kafka on Kubernetes and OpenShift

  • kafkacat - Generic CLI non-JVM Apache Kafka producer and consumer

  • connectctl - Manage Kafka Connect connectors easily

  • Kafdrop - Web UI for browsing Kafka topics and consumer groups

  • Kokpit - Basic Kafka client that supports publish and subscribe and runs as a Windows application

  • Strimzi Kafka CLI - A CLI for the Strimzi Kafka Operator

  • Kowl - A modern Kafka web UI written in Go & React