Thompson sampling

Creator
Seonglae Cho
Created
2025 Jul 10 20:41
Edited
2025 Jul 10 20:45
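Thompson sampling is a method for finding out experimentally which option works best when you have several options (e.g., slot machines, advertisements, or medical treatments) and don't know the success probability of any of them.
It represents its belief about each option's success probability as a probability distribution (e.g., a Beta distribution). At each step it draws one random sample from every option's distribution, tries the option whose sample is highest, and then updates that option's distribution based on the observed result. Options with uncertain estimates still get chosen occasionally, so exploration happens naturally, while options that keep succeeding are chosen more and more often.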
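The loop described above can be sketched for the simplest case, where each trial either succeeds or fails. This is a minimal illustration, not a reference implementation: the function name `thompson_sampling`, the simulated `true_probs`, the uniform Beta(1, 1) prior, and the round count are all assumptions made for the example. With a Beta prior and success/failure feedback, the update is just a count: a success adds 1 to the first parameter and a failure adds 1 to the second.

```python
import random


def thompson_sampling(true_probs, n_rounds=5000, seed=0):
    """Beta-Bernoulli Thompson sampling on simulated options (hypothetical helper).

    true_probs are the hidden success probabilities used only to simulate
    feedback; the algorithm itself never reads them directly.
    """
    rng = random.Random(seed)
    n_arms = len(true_probs)
    # Assumed prior: Beta(1, 1), i.e. uniform over [0, 1] for every option.
    alpha = [1] * n_arms  # 1 + number of observed successes
    beta = [1] * n_arms   # 1 + number of observed failures
    pulls = [0] * n_arms

    for _ in range(n_rounds):
        # Draw one plausible success probability per option from its distribution.
        samples = [rng.betavariate(alpha[i], beta[i]) for i in range(n_arms)]
        # Try the option whose sampled probability is highest.
        arm = max(range(n_arms), key=lambda i: samples[i])
        # Simulated feedback: 1 on success, 0 on failure.
        reward = 1 if rng.random() < true_probs[arm] else 0
        # Update only the distribution of the option that was tried.
        alpha[arm] += reward
        beta[arm] += 1 - reward
        pulls[arm] += 1

    return alpha, beta, pulls


if __name__ == "__main__":
    alpha, beta, pulls = thompson_sampling([0.2, 0.5, 0.8])
    for i, n in enumerate(pulls):
        estimate = alpha[i] / (alpha[i] + beta[i])  # posterior mean
        print(f"option {i}: tried {n} times, estimated success rate {estimate:.2f}")
```

Running the script prints how often each option was tried and its estimated success rate; in a setup like this one, most trials should end up on the option with the highest true probability, while the weaker options are tried just often enough to rule them out.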

