CSD

Creator: Seonglae Cho
Created: 2024 May 22 2:15
Edited: 2025 Jan 19 14:58

Controllability-aware Skill Discovery

We always seek new things: what is already easy to control stops being interesting.

Learn which states are easy to control and which are hard to control → more reward for changing hard-to-control states (low transition probability, so a large distance is allowed in skill space)
Shrink the skill space around easy-to-control states so that they earn less reward (easily controllable skills are rewarded less).

LSD with log probability
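
As a rough sketch in this note's notation, and assuming the usual LSD objective with its distance bound swapped for a negative log-likelihood, the idea can be written as below. The density p is assumed to be a learned transition model; a practical version may shift or rescale the distance d.

```latex
\max_{\pi,\,\phi}\ \mathbb{E}_{z,\,s,\,s'}\left[ (\phi(s') - \phi(s))^\top z \right]
\quad \text{s.t.} \quad
\|\phi(s') - \phi(s)\| \le d(s, s'),
\qquad d(s, s') = -\log p(s' \mid s)
```

In plain LSD the bound on the right-hand side would be the state distance ∥s′ − s∥; here it depends on how likely the transition is.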

Learn which states are easy to control and which are hard to control through the transition probability:
  • hard transition → small p(s′ ∣ s) → large ∥ϕ(s′) − ϕ(s)∥ allowed
  • easy transition → large p(s′ ∣ s) → only small ∥ϕ(s′) − ϕ(s)∥ allowed
Put more simply, this is LSD with the distance bound set per transition: easy transitions must stay close together in z-space, while hard transitions are allowed to be far apart in z-space.
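
The rule above can be illustrated with a minimal NumPy sketch. It assumes a diagonal Gaussian stands in for the learned transition model p(s′ ∣ s), and the function names (`transition_distance`, `skill_reward`, `constraint_violation`) are hypothetical helpers for illustration, not taken from any CSD codebase.

```python
import numpy as np

def transition_distance(s_next, mean, std):
    """-log p(s' | s) under an assumed diagonal Gaussian N(mean, std^2).

    Rare (hard) transitions get a large value, common (easy) ones a small value.
    Note: a continuous density can exceed 1, so this can be negative for very
    small std; a real implementation would need to handle that case.
    """
    var = std ** 2
    return 0.5 * np.sum((s_next - mean) ** 2 / var + np.log(2 * np.pi * var), axis=-1)

def skill_reward(phi_s, phi_next, z):
    """LSD-style reward: movement in the latent space projected onto skill z."""
    return np.sum((phi_next - phi_s) * z, axis=-1)

def constraint_violation(phi_s, phi_next, distance):
    """How far ||phi(s') - phi(s)|| exceeds the allowed distance (0 if satisfied)."""
    gap = np.linalg.norm(phi_next - phi_s, axis=-1) - distance
    return np.maximum(gap, 0.0)

# Toy 2-D example: the model predicts s' = (0, 0) with unit standard deviation.
mean, std = np.zeros(2), np.ones(2)
d_easy = transition_distance(np.zeros(2), mean, std)           # likely transition
d_hard = transition_distance(np.array([3.0, 0.0]), mean, std)  # unlikely transition

# The same latent jump of length 5 is only permitted for the hard transition.
phi_s, phi_next = np.zeros(2), np.array([3.0, 4.0])
print("easy:", d_easy, "violation:", constraint_violation(phi_s, phi_next, d_easy))
print("hard:", d_hard, "violation:", constraint_violation(phi_s, phi_next, d_hard))
print("reward along z:", skill_reward(phi_s, phi_next, np.array([0.6, 0.8])))
```

In training, a penalty on the violation term (for example through a Lagrange multiplier) would push ϕ to keep easy transitions close, so only hard transitions can yield large rewards; that training loop is left out of this sketch.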