Apache Storm
Definition
Apache Storm is a free and open-source distributed real-time computation system. It makes it easy to reliably process unbounded streams of data, doing for real-time processing what Hadoop did for batch processing.
Key characteristics include:
- Real-time Processing: Processes data as it arrives with minimal latency
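- Fault Tolerant: Automatically restarts failed worker processes and moves their work to another node if a machine dies
- Horizontal Scalability: Scales out to handle increasing data volumes by adding machines and raising the parallelism of spouts and bolts
- Language Agnostic: Topology components can be written in multiple programming languages through Storm's multi-language protocol
- Guaranteed Processing: With tuple acknowledgment enabled, ensures every message is processed at least once by replaying tuples that fail or time out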
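The at-least-once guarantee in the list above can be illustrated with a small conceptual model. This is a plain-Python sketch, not Storm's API: `process_at_least_once`, `handler`, and `max_attempts` are hypothetical names chosen for illustration. A handler that returns normally stands in for an ack, and one that raises stands in for a fail that triggers a replay.

```python
from collections import deque
from typing import Callable, Iterable, List


def process_at_least_once(
    messages: Iterable[str],
    handler: Callable[[str], None],
    max_attempts: int = 3,
) -> List[str]:
    """Deliver each message to `handler`, replaying it when the handler fails.

    Returns the delivery log, which can contain the same message more than
    once. `max_attempts` is an assumption added so the sketch always
    terminates; a message that keeps failing is dropped once it is reached.
    """
    pending = deque((msg, 1) for msg in messages)
    delivery_log: List[str] = []

    while pending:
        msg, attempt = pending.popleft()
        delivery_log.append(msg)
        try:
            handler(msg)  # returning normally plays the role of an ack
        except Exception:
            # A failure plays the role of a fail: queue the message for replay.
            if attempt < max_attempts:
                pending.append((msg, attempt + 1))

    return delivery_log
```

Because a message may be delivered again after a failure, downstream logic generally needs to tolerate duplicates, for example by making updates idempotent.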
Storm topologies consist of spouts (data sources) and bolts (processing logic) connected by streams in a directed acyclic graph. Storm is commonly used for real-time analytics, online machine learning, continuous computation, and ETL operations on streaming data.
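The spout-and-bolt structure can be sketched as a tiny word-count pipeline. The code below is a single-process, plain-Python model of the idea and does not use Storm's API; `sentence_spout`, `split_bolt`, `CountBolt`, and `run_topology` are hypothetical names. In a real topology each component would run as many parallel tasks across a cluster.

```python
from collections import Counter
from typing import Dict, Iterable, Iterator, List


def sentence_spout(sentences: Iterable[str]) -> Iterator[str]:
    """Spout: emits one tuple (here, a sentence) at a time from a source."""
    for sentence in sentences:
        yield sentence


def split_bolt(sentence: str) -> List[str]:
    """Bolt: turns one sentence tuple into several lowercase word tuples."""
    return sentence.lower().split()


class CountBolt:
    """Bolt: keeps a running count for every word it receives."""

    def __init__(self) -> None:
        self.counts: Counter = Counter()

    def process(self, word: str) -> None:
        self.counts[word] += 1


def run_topology(sentences: Iterable[str]) -> Dict[str, int]:
    """Wire the graph: sentence_spout -> split_bolt -> CountBolt."""
    counter = CountBolt()
    for sentence in sentence_spout(sentences):
        for word in split_bolt(sentence):
            counter.process(word)
    return dict(counter.counts)
```

For example, `run_topology(["the cat", "The dog"])` counts `the` twice and `cat` and `dog` once each. The data flows in one direction only, from the source through each processing step, which is what makes the graph acyclic.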
tl;dr
A distributed real-time computation system for processing unbounded streams of data.