Delta Action Learning

Creator
Seonglae Cho
Created
2025 Feb 9 11:22
Editor
Edited
2025 Feb 9 11:57
Refs
DeltaDynamics corrects the simulator's predicted next state with an additive residual dynamics term, $s_{t+1} = f_{sim}(s_t, a_t) + f_\Delta(s_t, a_t)$, while the ASAP method adjusts the action that is passed to the simulator, $s_{t+1} = f_{sim}(s_t, a_t + \pi_\Delta(s_t, a_t))$. DeltaDynamics compensates for the residual dynamics in state space, whereas ASAP applies the correction directly to the action, which is effective at reducing Compounding Error.
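The difference between the two update rules can be seen in a toy example. The following is a minimal sketch under stated assumptions, not either method's actual implementation: the 1-D simulator, the "real" system with a 0.8 actuator gain, and the hand-written corrections `f_delta` and `pi_delta` are all hypothetical. In practice both correction terms would be learned from real-world data.

```python
import numpy as np

DT = 0.1  # hypothetical integration step


def f_sim(s, a):
    """Toy simulator: the state integrates the action with unit gain."""
    return s + DT * a


def f_real(s, a):
    """Hypothetical real system: the actuator delivers only 80% of the command."""
    return s + DT * 0.8 * a


def f_delta(s, a):
    """Residual dynamics term (hand-written here; normally learned)."""
    return -0.2 * DT * a


def pi_delta(s, a):
    """Delta action term (hand-written here; normally a learned policy)."""
    return -0.2 * a


def step_delta_dynamics(s, a):
    # s_{t+1} = f_sim(s_t, a_t) + f_delta(s_t, a_t)
    return f_sim(s, a) + f_delta(s, a)


def step_delta_action(s, a):
    # s_{t+1} = f_sim(s_t, a_t + pi_delta(s_t, a_t))
    return f_sim(s, a + pi_delta(s, a))


s = np.array([1.0])
a = np.array([0.5])
print(f_real(s, a), step_delta_dynamics(s, a), step_delta_action(s, a))
```

In this toy case both corrections reproduce the real transition exactly. The structural difference is where the correction enters: the residual term is added to the simulator's output, while the delta action is fed through the simulator itself.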
