MPC (Model Predictive Control)
Model learning loss: MSE for a deterministic model, negative log-likelihood (i.e. maximize log probability) for a stochastic model.
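A minimal sketch of the two losses, assuming NumPy arrays and, for the stochastic case, a model that outputs the mean and log-variance of a diagonal Gaussian over the next state. The function names and the Gaussian choice are illustrative assumptions, not something these notes specify.

```python
import numpy as np

def mse_loss(pred_next_state, true_next_state):
    """MSE loss for a deterministic model s' = f(s, a)."""
    diff = np.asarray(pred_next_state, dtype=float) - np.asarray(true_next_state, dtype=float)
    return float(np.mean(diff ** 2))

def gaussian_nll(mean, log_var, true_next_state):
    """Negative log probability of the observed next state under a
    diagonal Gaussian p(s' | s, a) = N(mean, exp(log_var)).
    The Gaussian form is an assumption made for this sketch."""
    mean = np.asarray(mean, dtype=float)
    log_var = np.asarray(log_var, dtype=float)
    target = np.asarray(true_next_state, dtype=float)
    per_dim = 0.5 * (np.log(2.0 * np.pi) + log_var + (target - mean) ** 2 / np.exp(log_var))
    # Sum over state dimensions, then average over the batch (if any).
    return float(np.mean(np.sum(per_dim, axis=-1)))
```

Minimizing `gaussian_nll` is the same as maximizing the log probability of the observed transitions.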
It is usually easier to learn the reward function than the state-transition function, because rewards are defined by people.
Replan at every step to make the system closed-loop.
- MPC yields solutions that are only locally optimal in time (optimal within the planning horizon)
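The replanning loop can be sketched as follows. This is a toy illustration, not a method taken from these notes: it assumes a known 1-D model `s' = s + a`, a reward of `-s'^2`, and random shooting as the planner. All names and parameter values are hypothetical.

```python
import numpy as np

def dynamics(state, action):
    # Toy model (assumption): the action shifts the state directly.
    return state + action

def reward(next_state):
    # Toy reward (assumption): stay close to the origin.
    return -(next_state ** 2)

def plan_random_shooting(state, horizon, n_candidates, rng):
    """Sample random action sequences, roll them out through the model,
    and return the first action of the best sequence."""
    actions = rng.uniform(-1.0, 1.0, size=(n_candidates, horizon))
    states = np.full(n_candidates, state, dtype=float)
    returns = np.zeros(n_candidates)
    for t in range(horizon):
        states = dynamics(states, actions[:, t])
        returns += reward(states)
    return float(actions[np.argmax(returns), 0])

def run_mpc(initial_state, n_steps=20, horizon=5, n_candidates=200, seed=0):
    """Closed loop: plan, execute only the first action, observe, replan."""
    rng = np.random.default_rng(seed)
    state = float(initial_state)
    trajectory = [state]
    for _ in range(n_steps):
        action = plan_random_shooting(state, horizon, n_candidates, rng)
        state = float(dynamics(state, action))  # stands in for the real system
        trajectory.append(state)
    return trajectory
```

Each plan only looks `horizon` steps ahead, which is why the result is optimal only within that window.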
Use a value function for better planning, as in Monte Carlo Tree Search.
MPC handles local planning, and the value function approximates the globally optimal solution.
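One common way to combine the two is to score each candidate plan by its rewards inside the horizon plus the value function at the final state. The sketch below assumes an undiscounted return and user-supplied `dynamics`, `reward`, and `value_fn` callables. These names are hypothetical and only serve the illustration.

```python
def score_with_terminal_value(state, actions, dynamics, reward, value_fn):
    """Return sum of model rewards over the plan plus V(final state).

    dynamics(state, action) -> next state
    reward(next_state)      -> scalar reward
    value_fn(state)         -> estimate of the return beyond the horizon
    """
    total = 0.0
    for action in actions:
        state = dynamics(state, action)
        total += reward(state)
    # The value function stands in for everything after the horizon.
    return total + value_fn(state)

# Toy usage (all functions below are assumptions for the example).
step = lambda s, a: s + a
toy_reward = lambda s: -(s ** 2)
toy_value = lambda s: -3.0 * s ** 2

good = score_with_terminal_value(2.0, [-1.0, -1.0], step, toy_reward, toy_value)
bad = score_with_terminal_value(2.0, [0.0, 0.0], step, toy_reward, toy_value)
```

Here `good` scores higher than `bad`, because the plan that ends at the origin collects less penalty inside the horizon and a better terminal value.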