MAB (Multi-Armed Bandit)
A single-step problem: each decision point stands alone. Unlike the general RL setting, it has no state, so the central question is how to balance exploration and exploitation. Exploitation means choosing items that have yielded high returns so far; exploration means trying new or rarely chosen items to learn what they return.
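The trade-off can be made concrete with epsilon-greedy, one common bandit strategy (the note above does not name a specific algorithm, so this choice is an assumption). The sketch below is a minimal illustration: the function name `epsilon_greedy`, the Bernoulli (0/1) reward model, and the parameters `true_means`, `steps`, `epsilon`, and `seed` are all hypothetical names chosen for the example.

```python
import random


def epsilon_greedy(true_means, steps=5000, epsilon=0.1, seed=0):
    """Simulate epsilon-greedy on a Bernoulli bandit (illustrative sketch).

    true_means: assumed success probability of each item (arm); these are
                hidden from the agent and used only to generate rewards.
    Returns (pull counts, estimated mean reward per arm, total reward).
    """
    rng = random.Random(seed)
    n_arms = len(true_means)
    counts = [0] * n_arms    # how many times each arm was chosen
    values = [0.0] * n_arms  # running average reward per arm
    total_reward = 0.0

    for _ in range(steps):
        if rng.random() < epsilon:
            # Exploration: try an arm uniformly at random.
            arm = rng.randrange(n_arms)
        else:
            # Exploitation: pick the arm with the best estimate so far
            # (ties go to the lowest index).
            arm = max(range(n_arms), key=lambda a: values[a])

        # No state: the reward depends only on the chosen arm.
        reward = 1.0 if rng.random() < true_means[arm] else 0.0

        # Incremental update of the running average for this arm.
        counts[arm] += 1
        values[arm] += (reward - values[arm]) / counts[arm]
        total_reward += reward

    return counts, values, total_reward


if __name__ == "__main__":
    counts, values, total = epsilon_greedy([0.2, 0.5, 0.8])
    print("pulls per arm:", counts)
    print("estimated means:", [round(v, 3) for v in values])
    print("total reward:", total)
```

With `epsilon=0` the agent only exploits and can lock onto the first arm that pays off; with `epsilon=1` it only explores and never uses what it has learned. Values in between trade one against the other.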