CEM Iteration

The sample mean and variance of the elite samples minimize the cross-entropy (CE) between the current sampling distribution and the target distribution.
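A short derivation sketch of why this holds, assuming a Gaussian sampling distribution and a target approximated by the empirical distribution of the K elites a_1, ..., a_K (both are assumptions; the notes do not state them). Under those assumptions, minimizing CE is the same as maximizing the Gaussian log-likelihood of the elites:

```latex
\begin{aligned}
(\mu^{*}, \sigma^{2*})
  &= \arg\min_{\mu,\,\sigma^{2}}
     \; -\frac{1}{K}\sum_{i=1}^{K} \log \mathcal{N}\!\left(a_i ;\, \mu, \sigma^{2}\right) \\
  &= \arg\min_{\mu,\,\sigma^{2}}
     \; \frac{1}{2}\log\!\left(2\pi\sigma^{2}\right)
       + \frac{1}{2K\sigma^{2}}\sum_{i=1}^{K} (a_i - \mu)^{2} \\[4pt]
\frac{\partial}{\partial \mu} = 0
  &\;\Rightarrow\; \mu^{*} = \frac{1}{K}\sum_{i=1}^{K} a_i \\
\frac{\partial}{\partial \sigma^{2}} = 0
  &\;\Rightarrow\; \sigma^{2*} = \frac{1}{K}\sum_{i=1}^{K} \left(a_i - \mu^{*}\right)^{2}
\end{aligned}
```

So the re-fit step is just the maximum-likelihood Gaussian over the elites.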

Number of CEM iterations: more iterations means more planning and better performance; 5 is a good operating point.

Set an action distribution, then update its parameters: re-fit the distribution using the top-K elite actions.
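The update above can be sketched as a single function. This is a minimal illustration, not the method from any particular codebase: it assumes a diagonal Gaussian over actions, and `cem_iteration`, `score_fn`, `n_samples`, `top_k`, and the toy `target` are all hypothetical names chosen for the example.

```python
import numpy as np

def cem_iteration(mean, std, score_fn, n_samples=64, top_k=8, rng=None):
    """One CEM iteration: sample, score, keep the top-K elites, re-fit."""
    rng = np.random.default_rng() if rng is None else rng
    # Sample candidate actions from the current diagonal Gaussian (assumption).
    samples = rng.normal(mean, std, size=(n_samples, mean.shape[0]))
    # Score every candidate; higher is better.
    scores = np.array([score_fn(a) for a in samples])
    # Keep the top-K highest-scoring candidates as elites.
    elites = samples[np.argsort(scores)[-top_k:]]
    # Re-fit the distribution parameters to the elites.
    new_mean = elites.mean(axis=0)
    new_std = elites.std(axis=0) + 1e-6  # small floor avoids a zero std
    return new_mean, new_std

# Toy usage: the score is the negative squared distance to a made-up target.
target = np.array([1.0, -2.0])
score = lambda a: -np.sum((a - target) ** 2)

rng = np.random.default_rng(0)
mean, std = np.zeros(2), np.full(2, 2.0)
for _ in range(5):  # 5 iterations, matching the rule of thumb in these notes
    mean, std = cem_iteration(mean, std, score, rng=rng)
```

After a few iterations the mean should drift toward the toy target while the standard deviation shrinks.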

CEM generates actions purely by planning through the model, without any explicit policy.
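To make "planning through the model" concrete, here is a rough sketch in which no policy network appears anywhere. The dynamics and reward in `toy_model` are invented for illustration (a 1-D point that moves by the action, with a goal at 3.0); in practice this would be a learned model. `cem_plan`, `rollout_return`, and all hyperparameters are likewise assumptions.

```python
import numpy as np

def toy_model(state, action):
    """Hypothetical dynamics model: the state shifts by the action."""
    next_state = state + action
    reward = -(next_state - 3.0) ** 2  # assumed goal position at 3.0
    return next_state, reward

def rollout_return(state, actions, model):
    """Sum of rewards from rolling an action sequence through the model."""
    total = 0.0
    for a in actions:
        state, r = model(state, a)
        total += r
    return total

def cem_plan(state, model, horizon=5, n_iters=5, n_samples=100, top_k=10, seed=0):
    """Plan an action sequence with CEM and return only its first action."""
    rng = np.random.default_rng(seed)
    mean, std = np.zeros(horizon), np.ones(horizon)
    for _ in range(n_iters):
        # Sample whole action sequences from the current distribution.
        plans = rng.normal(mean, std, size=(n_samples, horizon))
        # Evaluate each sequence by simulating it with the model.
        returns = np.array([rollout_return(state, p, model) for p in plans])
        # Re-fit mean and std to the top-K elite sequences.
        elites = plans[np.argsort(returns)[-top_k:]]
        mean, std = elites.mean(axis=0), elites.std(axis=0) + 1e-6
    return mean[0]

first_action = cem_plan(state=0.0, model=toy_model)
```

Returning only the first planned action and re-planning at the next state is one common way to use such a planner; the notes do not say which scheme is intended.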

Without elite fitting, this reduces to plain guess & check. CEM improves the sampling distribution by re-fitting it to the selected elites.