Model-Predictive Control

Train a deterministic model with an MSE loss, and a stochastic model by maximizing the log probability of the observed next state.
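The two training objectives can be sketched as below. This is a minimal illustration, not a reference implementation. It assumes the stochastic model outputs a mean and a log-variance per state dimension (a diagonal Gaussian, which the notes do not specify), and the function names mse_loss and gaussian_nll_loss are hypothetical.

```python
import math

def mse_loss(pred_next_states, true_next_states):
    """Deterministic model: mean squared error over all state dimensions."""
    total, count = 0.0, 0
    for pred, true in zip(pred_next_states, true_next_states):
        for p, t in zip(pred, true):
            total += (p - t) ** 2
            count += 1
    return total / count

def gaussian_nll_loss(pred_means, pred_log_vars, true_next_states):
    """Stochastic model: average negative log probability of the observed
    next state under a diagonal Gaussian (assumed output distribution)."""
    total, count = 0.0, 0
    for mean, log_var, true in zip(pred_means, pred_log_vars, true_next_states):
        for m, lv, t in zip(mean, log_var, true):
            # -log N(t | m, exp(lv)) for one dimension
            total += 0.5 * (math.log(2 * math.pi) + lv + (t - m) ** 2 / math.exp(lv))
            count += 1
    return total / count
```

Minimizing gaussian_nll_loss is the same as maximizing the log probability; with a fixed variance it reduces to the MSE objective up to a constant and a scale factor.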

It is usually easier to learn the reward function than the state-transition function, because rewards are defined by people.

Replan at each time step, so that the controller forms a closed-loop system.
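A closed-loop replanning controller can be sketched as follows. This is only an illustration under stated assumptions: the 1-D dynamics and reward are hypothetical toy stand-ins for a learned model, and random shooting is used as the planner because it is simple, not because the notes prescribe it.

```python
import random

def dynamics(state, action):
    # Hypothetical toy model: a 1-D point displaced by the action.
    return state + action

def reward(next_state, action):
    # Hypothetical reward: stay near the origin using small actions.
    return -(next_state ** 2) - 0.1 * action ** 2

def plan(state, rng, horizon=5, num_candidates=200):
    """Random shooting: sample action sequences, score them with the model,
    and return the first action of the best sequence."""
    best_return, best_first_action = -float("inf"), 0.0
    for _ in range(num_candidates):
        actions = [rng.uniform(-1.0, 1.0) for _ in range(horizon)]
        s, total = state, 0.0
        for a in actions:
            s = dynamics(s, a)
            total += reward(s, a)
        if total > best_return:
            best_return, best_first_action = total, actions[0]
    return best_first_action

def run_mpc(initial_state, steps=20, seed=0):
    rng = random.Random(seed)
    state = initial_state
    for _ in range(steps):
        action = plan(state, rng)        # replan from the current state
        state = dynamics(state, action)  # execute only the first action
    return state
```

Because the plan is recomputed from the state actually reached, errors from the model or from disturbances are corrected at the next step rather than accumulating over an open-loop action sequence.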

Utilize a value function for better planning: MPC handles local planning, while the value function approximates the globally optimal solution.
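One way to combine the two is to score each short rollout with the model rewards plus a value estimate at the final state. The sketch below assumes this terminal-value form; rollout_score, plan_with_value, and the discount parameter are hypothetical names, and the dynamics, reward, and value function are passed in by the caller.

```python
import random

def rollout_score(state, actions, dynamics, reward, value_fn, discount=0.99):
    """Discounted model rewards over the horizon (local planning) plus a
    discounted value estimate at the final state (global approximation)."""
    total, scale = 0.0, 1.0
    for a in actions:
        state = dynamics(state, a)
        total += scale * reward(state, a)
        scale *= discount
    return total + scale * value_fn(state)

def plan_with_value(state, dynamics, reward, value_fn,
                    horizon=3, num_candidates=100, seed=0):
    """Random-shooting planner that ranks candidates by rollout_score and
    returns the first action of the best candidate."""
    rng = random.Random(seed)
    candidates = [[rng.uniform(-1.0, 1.0) for _ in range(horizon)]
                  for _ in range(num_candidates)]
    best = max(candidates,
               key=lambda acts: rollout_score(state, acts, dynamics, reward, value_fn))
    return best[0]
```

The terminal value lets the planner use a short horizon while still accounting for rewards beyond it, at the cost of depending on how accurate the value estimate is.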