Cross-Entropy Method, Elite Iteration
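For a Gaussian sampling distribution, re-fitting to the sample mean and variance of the elite samples minimizes the cross-entropy between the target distribution (approximated by the elites) and the sampling distribution.
CEM iterations: more planning iterations generally give better performance; around 5 iterations is a good operating point.
Set an initial action distribution, then on each iteration re-fit the distribution parameters using the top-K elite actions.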

CEM can generate actions purely by planning through the model, without any explicit policy.
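The loop described above (sample actions, score them through a model, re-fit to the top-K elites) might look like the following minimal sketch. It assumes a diagonal Gaussian over action sequences; `cem_plan`, `toy_cost`, and the 1-D toy dynamics `x_next = x + a` are hypothetical names and choices made for illustration, not part of the original note.

```python
import numpy as np

def toy_cost(actions, x0=0.0, goal=3.0):
    """Hypothetical model rollout: 1-D state with x_next = x + a, quadratic distance cost."""
    x, total = x0, 0.0
    for a in actions:
        x = x + float(a[0])
        total += (x - goal) ** 2
    return total

def cem_plan(cost_fn, horizon, action_dim, n_samples=200, n_elites=20,
             n_iters=5, init_std=1.0, seed=0):
    """Return the mean action sequence after n_iters CEM updates (assumed diagonal Gaussian)."""
    rng = np.random.default_rng(seed)
    mean = np.zeros((horizon, action_dim))
    std = np.full((horizon, action_dim), init_std)
    for _ in range(n_iters):
        # Sample candidate action sequences from the current distribution.
        noise = rng.standard_normal((n_samples, horizon, action_dim))
        samples = mean + std * noise
        # Score every candidate by rolling it out through the model.
        costs = np.array([cost_fn(a) for a in samples])
        # Keep the top-K (lowest-cost) candidates as elites.
        elites = samples[np.argsort(costs)[:n_elites]]
        # Re-fit the Gaussian to the elites: sample mean and standard deviation.
        mean = elites.mean(axis=0)
        std = elites.std(axis=0) + 1e-6  # small floor so sampling never collapses to a point
    return mean

plan = cem_plan(toy_cost, horizon=5, action_dim=1)
print("planned cost:", toy_cost(plan))
print("zero-action cost:", toy_cost(np.zeros((5, 1))))
```

No policy network appears anywhere: the action sequence comes entirely from repeated sampling and re-fitting against the model. In a receding-horizon setup one would typically execute only the first action of `plan` and re-plan at the next step.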
Random shooting
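Guess and check without elite fitting: sample candidate action sequences, evaluate each one through the model, and keep the best. CEM improves on this by re-fitting the sampling distribution to the selected elites.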
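For contrast, the guess-and-check procedure can be sketched in a few lines. This is an illustrative sketch under assumed choices: uniform sampling within action bounds, and the same hypothetical 1-D toy model (`toy_cost`, with `x_next = x + a`); none of these names come from the original note.

```python
import numpy as np

def toy_cost(actions, x0=0.0, goal=3.0):
    """Hypothetical model rollout: 1-D state with x_next = x + a, quadratic distance cost."""
    x, total = x0, 0.0
    for a in actions:
        x = x + float(a[0])
        total += (x - goal) ** 2
    return total

def random_shooting(cost_fn, horizon, action_dim, n_samples=1000,
                    low=-2.0, high=2.0, seed=0):
    """Sample action sequences once from a fixed distribution and return the best one."""
    rng = np.random.default_rng(seed)
    # Guess: draw all candidates from a fixed uniform distribution (assumed bounds).
    samples = rng.uniform(low, high, size=(n_samples, horizon, action_dim))
    # Check: evaluate each candidate through the model.
    costs = np.array([cost_fn(a) for a in samples])
    best = int(np.argmin(costs))
    # There is no re-fitting step: the sampling distribution never changes.
    return samples[best], float(costs[best])

best_actions, best_cost = random_shooting(toy_cost, horizon=5, action_dim=1)
print("best sampled cost:", best_cost)
```

The only structural difference from CEM is the missing elite re-fit, so all of the sampling budget is spent on a single fixed distribution instead of a progressively narrowed one.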
Cross-entropy method
The cross-entropy (CE) method is a Monte Carlo method for importance sampling and optimization. It is applicable to both combinatorial and continuous problems, with either a static or noisy objective.
https://en.wikipedia.org/wiki/Cross-entropy_method

Seonglae Cho