AlphaGo

Creator

Created

2023 Dec 30 16:18

Editor

Edited

2025 Mar 17 1:5

Refs

AlphaGo uses Monte Carlo Tree Search (MCTS) because the space of the whole game is too large to simulate exhaustively, and there is a time limit for choosing each action.

Simulations use the default policy because it is fast: the default policy takes 1 microsecond, while the tree policy takes 1 ms, so far more simulations fit within the time limit.
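The split between a tree policy and a default policy under a time budget can be sketched as below. This is a minimal illustration, not AlphaGo's implementation: the toy game (single-pile Nim, where taking the last stone wins), the UCB1 selection rule, the uniformly random rollout, and every name (`Node`, `tree_policy_select`, `default_policy_rollout`, `mcts`, `max_iters`) are assumptions made for the example.

```python
import math
import random
import time

# Toy game (assumption for illustration): single-pile Nim.
# Players alternate removing 1-3 stones; whoever takes the last stone wins.
def legal_moves(stones):
    return [m for m in (1, 2, 3) if m <= stones]

class Node:
    def __init__(self, stones, parent=None, move=None):
        self.stones = stones            # stones left for the player to move
        self.parent = parent
        self.move = move                # move that led from parent to here
        self.children = []
        self.untried = legal_moves(stones)
        self.visits = 0
        self.wins = 0.0                 # wins for the player who moved INTO this node

def tree_policy_select(node, c=1.4):
    # UCB1: trade off average win rate against rarely visited children.
    return max(
        node.children,
        key=lambda ch: ch.wins / ch.visits
        + c * math.sqrt(math.log(node.visits) / ch.visits),
    )

def default_policy_rollout(stones, rng):
    # Cheap random playout. Returns True if the player to move at `stones` wins.
    player = 0
    while stones > 0:
        stones -= rng.choice(legal_moves(stones))
        if stones == 0:
            return player == 0
        player ^= 1
    return False  # no stones left: the player to move has already lost

def mcts(root_stones, time_limit_s=0.05, seed=0, max_iters=None):
    rng = random.Random(seed)
    root = Node(root_stones)
    deadline = time.monotonic() + time_limit_s
    iters = 0
    while time.monotonic() < deadline and (max_iters is None or iters < max_iters):
        iters += 1
        node = root
        # 1. Selection: follow the tree policy through fully expanded nodes.
        while not node.untried and node.children:
            node = tree_policy_select(node)
        # 2. Expansion: add one unexplored child, if any.
        if node.untried:
            move = node.untried.pop()
            child = Node(node.stones - move, parent=node, move=move)
            node.children.append(child)
            node = child
        # 3. Simulation: finish the game with the fast default policy.
        mover_wins = default_policy_rollout(node.stones, rng)
        # 4. Backpropagation: flip the result at each level of the tree.
        result = 0.0 if mover_wins else 1.0
        while node is not None:
            node.visits += 1
            node.wins += result
            result = 1.0 - result
            node = node.parent
    # Play the most visited move once the budget is spent.
    return max(root.children, key=lambda ch: ch.visits).move
```

Calling `mcts(10)` returns one of the legal moves 1, 2, or 3. The search stops when the time budget runs out, so a cheaper default policy directly translates into more simulations per decision.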

AlphaGo was first pre-trained on human game data and then improved through self-play with an efficient MCTS. The outcomes of the self-play games are used as the training signal for further training.
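The two training stages can be sketched on the same kind of toy game. This is a heavily simplified illustration under stated assumptions: a tabular softmax policy stands in for AlphaGo's neural networks, the "human" records are made up, self-play samples moves directly from the policy rather than running MCTS, and the update simply raises the preference of the winner's moves and lowers the loser's. All names (`prefs`, `pretrain`, `self_play_game`, `train_from_self_play`) are hypothetical.

```python
import math
import random

# Toy game (assumption): single-pile Nim, taking the last stone wins.
def legal_moves(stones):
    return [m for m in (1, 2, 3) if m <= stones]

def move_probs(prefs, stones):
    # Softmax over the preferences of the legal moves in this state.
    moves = legal_moves(stones)
    weights = [math.exp(prefs.get((stones, m), 0.0)) for m in moves]
    total = sum(weights)
    return moves, [w / total for w in weights]

def pretrain(prefs, human_records, lr=0.5):
    # Stage 1: nudge the policy toward moves seen in (state, move) records.
    for stones, move in human_records:
        prefs[(stones, move)] = prefs.get((stones, move), 0.0) + lr

def self_play_game(prefs, start, rng):
    # Both players sample from the same policy. Assumes start > 0.
    history, stones, player = [], start, 0
    while stones > 0:
        moves, probs = move_probs(prefs, stones)
        move = rng.choices(moves, weights=probs)[0]
        history.append((player, stones, move))
        stones -= move
        if stones == 0:
            return history, player  # this player took the last stone
        player ^= 1

def train_from_self_play(prefs, games=2000, start=10, lr=0.05, seed=0):
    # Stage 2: use each game's outcome as the training signal.
    rng = random.Random(seed)
    for _ in range(games):
        history, winner = self_play_game(prefs, start, rng)
        for player, stones, move in history:
            reward = 1.0 if player == winner else -1.0
            prefs[(stones, move)] = prefs.get((stones, move), 0.0) + lr * reward
    return prefs

# Made-up "human" records, for illustration only.
prefs = {}
pretrain(prefs, [(3, 3), (2, 2), (1, 1)])
train_from_self_play(prefs)
```

After training, `move_probs(prefs, 3)` gives taking all 3 stones a higher probability than taking 2, since the first always wins the game and the second always loses it.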

///////