A type of machine learning where agents learn optimal actions through trial and error and reward feedback. Powers game-playing AI and autonomous systems.