Apache Storm

Definition

Apache Storm is a free and open-source distributed real-time computation system. It makes it easy to reliably process unbounded streams of data, doing for real-time processing what Hadoop did for batch processing.

Key characteristics include:

  • Real-time Processing: Processes data as it arrives with minimal latency
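  • Fault Tolerant: Automatically restarts failed worker processes and reassigns their tasks when a node dies
  • Horizontal Scalability: Scales out to handle increasing data volumes by adding machines and raising a topology's parallelism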
  • Language Agnostic: Supports multiple programming languages
  • Guaranteed Processing: Ensures every message is processed at least once
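
The at-least-once guarantee in the last item can be illustrated with a small simulation. This is a minimal sketch in plain Python, not Storm's API: the class ReplayingSource, the run loop, and flaky_process are hypothetical names made up for illustration. The idea it shows is that a source keeps each message pending until it is acknowledged, and replays it if processing fails.

```python
from collections import deque


class ReplayingSource:
    """Hypothetical stand-in for a source that replays messages until acked."""

    def __init__(self, messages):
        self.queue = deque(enumerate(messages))  # (msg_id, payload) pairs
        self.pending = {}  # emitted but not yet acknowledged

    def next_tuple(self):
        if not self.queue:
            return None
        msg_id, payload = self.queue.popleft()
        self.pending[msg_id] = payload
        return msg_id, payload

    def ack(self, msg_id):
        # Processing succeeded: forget the message.
        self.pending.pop(msg_id, None)

    def fail(self, msg_id):
        # Processing failed: put the message back so it is tried again.
        self.queue.append((msg_id, self.pending.pop(msg_id)))


def run(source, process):
    """Drain the source, acking successes and replaying failures."""
    while (item := source.next_tuple()) is not None:
        msg_id, payload = item
        try:
            process(payload)
            source.ack(msg_id)
        except Exception:
            source.fail(msg_id)


attempts = {}
processed = []


def flaky_process(payload):
    # Simulate one transient failure on the first attempt at "b".
    attempts[payload] = attempts.get(payload, 0) + 1
    if payload == "b" and attempts[payload] == 1:
        raise RuntimeError("simulated worker failure")
    processed.append(payload)


source = ReplayingSource(["a", "b", "c"])
run(source, flaky_process)
print(processed)  # every message appears, "b" after its replay
```

In this sketch the failure happens before any side effect, so each message is recorded once. If a failure occurred after a side effect, the replay would apply it twice, which is why processing logic under at-least-once delivery is usually written to be idempotent.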

A Storm topology consists of spouts (data sources) and bolts (processing logic) connected in a directed acyclic graph: spouts emit tuples into the graph, and each bolt consumes tuples and may emit new ones to the bolts downstream. Storm is commonly used for real-time analytics, online machine learning, continuous computation, and ETL operations on streaming data.
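
The spout-and-bolt structure can be sketched as a tiny word-count pipeline. This is a conceptual model in plain Python under stated assumptions, not Storm code: sentence_spout, split_bolt, CountBolt, and run_topology are hypothetical names, and the runner is single-threaded, whereas Storm distributes the same graph across a cluster. In Storm itself, topologies are typically assembled in Java with TopologyBuilder.

```python
from collections import Counter


def sentence_spout():
    """Data source: emits one sentence at a time (finite here for the demo)."""
    yield from ["the cow jumped", "the moon"]


def split_bolt(sentence):
    """Processing step: emits one word per input sentence token."""
    return sentence.split()


class CountBolt:
    """Terminal processing step: keeps a running count per word."""

    def __init__(self):
        self.counts = Counter()

    def __call__(self, word):
        self.counts[word] += 1
        return []  # emits nothing downstream


def run_topology(spout, spout_name, bolts, edges):
    """Push every value from the spout through the graph described by edges."""

    def emit(source, value):
        for target in edges.get(source, []):
            for out in bolts[target](value):
                emit(target, out)

    for value in spout():
        emit(spout_name, value)


counter = CountBolt()
bolts = {"split": split_bolt, "count": counter}
# Directed acyclic graph: sentences -> split -> count
edges = {"sentences": ["split"], "split": ["count"]}

run_topology(sentence_spout, "sentences", bolts, edges)
print(dict(counter.counts))
```

The edges dictionary is the whole topology definition here: adding a second bolt downstream of "split" would only require another entry in the list, which mirrors how a real topology wires one component's output to several consumers.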

tl;dr
A distributed real-time computation system for processing unbounded streams of data.