PLDM

Creator: Seonglae Cho
Created: 2025 Jul 20 0:2
Edited: 2025 Jul 20 0:8

Planning with a Latent Dynamics Model

This method trains a latent dynamics model and then performs model-based planning on top of it, using model predictive control (MPC) or model predictive path integral control (MPPI). In other words, instead of directly optimizing a policy network, the agent selects actions by planning at each timestep within a learned world model.
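To make the planning loop concrete, here is a minimal MPPI sketch in NumPy. It is an illustration under stated assumptions, not the method's actual implementation: `toy_dynamics` is a hypothetical stand-in for a trained latent predictor, the cost is assumed to be Euclidean distance to a goal latent, and every hyperparameter (horizon, sample count, noise scale, temperature) is an arbitrary choice.

```python
import numpy as np

def mppi_plan(z0, z_goal, dynamics, horizon=10, num_samples=256,
              num_iters=5, noise_std=0.5, temperature=1.0, seed=0):
    """Return an action sequence (horizon, action_dim) that pulls z0 toward z_goal."""
    rng = np.random.default_rng(seed)
    action_dim = z0.shape[0]  # assumption for this toy: actions live in latent space
    mean = np.zeros((horizon, action_dim))  # nominal plan, refined each iteration
    for _ in range(num_iters):
        # Sample perturbed action sequences around the current nominal plan.
        noise = rng.normal(0.0, noise_std, size=(num_samples, horizon, action_dim))
        actions = mean[None] + noise
        # Roll every candidate out inside the latent dynamics model.
        z = np.repeat(z0[None], num_samples, axis=0)
        cost = np.zeros(num_samples)
        for t in range(horizon):
            z = dynamics(z, actions[:, t])
            cost += np.linalg.norm(z - z_goal[None], axis=1)
        # MPPI update: exponentially weighted average of the sampled sequences.
        weights = np.exp(-(cost - cost.min()) / temperature)
        weights /= weights.sum()
        mean = np.einsum("n,nha->ha", weights, actions)
    return mean

def toy_dynamics(z, a):
    # Hypothetical stand-in for a learned latent predictor: z' = z + 0.1 * a.
    return z + 0.1 * a

def run_episode(z0, z_goal, steps=30):
    """Receding-horizon control: replan at every timestep, execute the first action."""
    z = z0.copy()
    for step in range(steps):
        plan = mppi_plan(z, z_goal, toy_dynamics, seed=step)
        z = toy_dynamics(z[None], plan[0][None])[0]
    return z
```

Calling `run_episode(np.zeros(2), np.ones(2))` should drive the toy latent state toward the goal. Only the first action of each plan is executed before replanning, which is what distinguishes this from running a fixed policy network.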
The goal is to train agents that work well across multiple goals and new environments using only offline state-action trajectories, with no reward labels. A Joint Embedding Predictive Architecture (JEPA) is used to learn the latent dynamics model, and planning then generates action sequences that minimize the distance between the current state and the goal in latent space (the cost can also be redefined per task). This allows immediate transfer to various goals, new layouts, and new tasks without reward annotations.
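As a rough illustration of how a latent dynamics model could be trained from reward-free trajectories, the sketch below computes a JEPA-style objective in NumPy: the predictor is rolled forward in latent space and compared against encoded future observations, so no reward appears anywhere. The linear encoder, linear predictor, and all parameter names are hypothetical. The variance hinge is an assumed VICReg-style regularizer against representation collapse; the text above does not say which anti-collapse mechanism is used.

```python
import numpy as np

def encode(obs, W_enc):
    # Hypothetical encoder: one linear layer followed by tanh.
    return np.tanh(obs @ W_enc)

def predict(z, action, W_z, W_a):
    # Hypothetical latent predictor: next latent from current latent and action.
    return z @ W_z + action @ W_a

def jepa_loss(obs, actions, W_enc, W_z, W_a, var_weight=1.0):
    """obs: (T+1, B, obs_dim); actions: (T, B, act_dim). No rewards are needed."""
    targets = encode(obs, W_enc)          # encoded observations, (T+1, B, D)
    horizon = actions.shape[0]
    latent_dim = targets.shape[-1]
    z = targets[0]
    pred_loss = 0.0
    for t in range(horizon):
        # Roll the predictor forward on its own output (multi-step prediction).
        z = predict(z, actions[t], W_z, W_a)
        pred_loss += np.mean((z - targets[t + 1]) ** 2)
    pred_loss /= horizon
    # Assumed anti-collapse term: penalize latent dimensions whose std is below 1.
    std = targets.reshape(-1, latent_dim).std(axis=0)
    var_loss = np.mean(np.maximum(0.0, 1.0 - std))
    return pred_loss + var_weight * var_loss
```

The variance term matters because a constant encoder makes the prediction error trivially zero: with all-zero weights the prediction loss vanishes and only the variance penalty keeps the total loss away from zero. A real implementation would minimize this loss with an autodiff framework; this sketch only evaluates it.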

ICML 2025 Best Paper

