Multi-Armed Bandit

Creator
Seonglae Cho
Created
2025 Feb 23 18:36
Editor
Seonglae Cho
Edited
2025 Mar 5 12:36

MAB

A multi-armed bandit (MAB) is a single-step decision problem: unlike the general RL setting, it has no state, so the reward for an action does not depend on earlier actions. The focus is on balancing exploration and exploitation. Exploitation means choosing the items (arms) that have produced high returns so far, and exploration means trying other items to learn how good they are.
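The exploration/exploitation trade-off can be illustrated with epsilon-greedy, one common strategy that this note does not itself name. The sketch below is a minimal illustration under stated assumptions: the function name `epsilon_greedy`, the Bernoulli reward model, and the parameters `true_means`, `steps`, `epsilon`, and `seed` are all hypothetical choices made for the example.

```python
import random


def epsilon_greedy(true_means, steps=5000, epsilon=0.1, seed=0):
    """Run epsilon-greedy on a simulated Bernoulli bandit.

    true_means[a] is the (assumed) probability that arm a pays reward 1.
    Returns (estimates, counts): the running mean reward and pull count per arm.
    """
    rng = random.Random(seed)
    n_arms = len(true_means)
    counts = [0] * n_arms        # number of times each arm was pulled
    estimates = [0.0] * n_arms   # sample-average reward observed per arm

    for _ in range(steps):
        if rng.random() < epsilon:
            # Explore: pick an arm uniformly at random.
            arm = rng.randrange(n_arms)
        else:
            # Exploit: pick the arm with the highest estimate so far
            # (ties go to the lowest index).
            arm = max(range(n_arms), key=lambda a: estimates[a])

        # No state: the reward depends only on the chosen arm.
        reward = 1.0 if rng.random() < true_means[arm] else 0.0

        # Incremental update of the sample average for the chosen arm.
        counts[arm] += 1
        estimates[arm] += (reward - estimates[arm]) / counts[arm]

    return estimates, counts


if __name__ == "__main__":
    estimates, counts = epsilon_greedy([0.2, 0.5, 0.8])
    print("pulls per arm:", counts)
    print("estimated means:", [round(q, 2) for q in estimates])
```

With a small `epsilon`, most pulls go to the arm that currently looks best, while the occasional random pull keeps the estimates of the other arms from going stale. Other strategies such as UCB or Thompson sampling handle the same trade-off differently.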
 
 
 
 
Multi-armed bandit
In probability theory and machine learning, the multi-armed bandit problem is a problem in which a decision maker iteratively selects one of multiple fixed choices when the properties of each choice are only partially known at the time of allocation, and may become better understood as time passes. A fundamental aspect of bandit problems is that choosing an arm does not affect the properties of the arm or other arms.
https://en.wikipedia.org/wiki/Multi-armed_bandit
 
 

Copyright Seonglae Cho