Apache Kafka

Definition

Apache Kafka is an open-source distributed event streaming platform; large deployments are reported to handle trillions of events per day. It provides a unified, high-throughput, low-latency platform for publishing, storing, and processing real-time data feeds.

Core concepts include:

  • Topics: Categories or feed names to which messages are published
  • Producers: Applications that publish messages to topics
  • Consumers: Applications that subscribe to topics and process messages
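  • Brokers: Kafka servers that store topic data and handle requests from producers and consumers
  • Partitions: Ordered, append-only subdivisions of a topic, spread across brokers so that a topic can be written and read in parallel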
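
How these concepts fit together can be sketched with a small in-memory simulation. This is a toy model, not the Kafka client API: the Broker, Producer, and Consumer classes, their method names, and the "orders" topic are all hypothetical, and crc32 stands in for the hash a real Kafka client would use to pick a partition. It only illustrates that messages with the same key land in the same partition, and that a consumer tracks an offset per partition.

```python
import zlib
from collections import defaultdict


class Broker:
    """Toy in-memory stand-in for a Kafka broker (hypothetical, not a Kafka API)."""

    def __init__(self):
        # topic name -> list of partitions; each partition is an append-only list
        self.topics = {}

    def create_topic(self, name, num_partitions):
        self.topics[name] = [[] for _ in range(num_partitions)]

    def append(self, topic, key, value):
        partitions = self.topics[topic]
        # The same key always maps to the same partition, which keeps
        # per-key ordering. crc32 is an assumption made for this sketch.
        index = zlib.crc32(key.encode("utf-8")) % len(partitions)
        partitions[index].append((key, value))
        offset = len(partitions[index]) - 1
        return index, offset

    def read(self, topic, partition, offset):
        # Return every record in the partition from `offset` onward.
        return self.topics[topic][partition][offset:]


class Producer:
    """Publishes keyed messages to a topic."""

    def __init__(self, broker):
        self.broker = broker

    def send(self, topic, key, value):
        return self.broker.append(topic, key, value)


class Consumer:
    """Reads a topic and remembers the next offset for each partition."""

    def __init__(self, broker, topic):
        self.broker = broker
        self.topic = topic
        self.offsets = defaultdict(int)  # partition index -> next offset

    def poll(self):
        records = []
        for partition in range(len(self.broker.topics[self.topic])):
            batch = self.broker.read(self.topic, partition, self.offsets[partition])
            self.offsets[partition] += len(batch)
            records.extend(batch)
        return records


if __name__ == "__main__":
    broker = Broker()
    broker.create_topic("orders", num_partitions=3)

    producer = Producer(broker)
    for i in range(4):
        producer.send("orders", key="customer-1", value=f"order-{i}")

    consumer = Consumer(broker, "orders")
    print([value for _, value in consumer.poll()])
    print(consumer.poll())  # nothing new since the last poll
```

Because all four messages share one key, they sit in a single partition and are read back in the order they were sent; a second poll returns an empty list because the consumer's offsets have already advanced. A real cluster adds replication, consumer groups, and durable storage, none of which this sketch attempts to model.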

Kafka is widely used to build real-time streaming data pipelines and stream processing applications, to implement event sourcing architectures and log aggregation, and as a messaging system for communication between microservices.

tl;dr
An open-source distributed event streaming platform for building real-time data pipelines.