CSD

Creator: Seonglae Cho
Created: 2024 May 22 2:15
Edited: 2025 Jan 19 14:58

Controllability-aware Skill Discovery

We always seek new things

Learn which states are easy to control and which are hard to control → transitions that change hard-to-control states earn more reward (they have low probability and are allowed a large distance in the skill space)
Shrink the skill space available to easy-to-control states, so that easily controllable skills earn less reward.

LSD (Lipschitz-constrained Skill Discovery) with a log-probability distance

Easy-to-control and hard-to-control states are told apart by their transition probability:
  • hard transition → small p(s′ ∣ s) → large ∥ϕ(s′) − ϕ(s)∥
  • easy transition → high p(s′ ∣ s) → small ∥ϕ(s′) − ϕ(s)∥
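Written as a constrained objective, the two bullets above can be sketched as follows. This is a sketch assuming the LSD-style formulation: z is the skill vector, ϕ is the learned state representation, and q is a learned density model standing in for p(s′ ∣ s). The symbols z, q, and d are not defined in this note and are assumptions introduced here.

```latex
\max_{\pi,\,\phi}\; \mathbb{E}_{z,\,(s,s')}\!\left[ (\phi(s') - \phi(s))^{\top} z \right]
\quad \text{s.t.} \quad
\|\phi(s') - \phi(s)\| \le d(s, s'),
\qquad
d(s, s') \propto -\log q(s' \mid s) + \text{const}
```

A low-probability transition has a large d(s, s′), so ϕ is allowed to move far and the inner-product reward can be large; a high-probability transition has a small d(s, s′), so its latent displacement, and therefore its reward, stays small.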
Put more simply, we run LSD under the condition above: easy transitions are only allowed to be close together in z-space, while hard transitions are allowed to be far apart in z-space.
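The effect of this constraint on the reward can be illustrated with a small numeric sketch. Everything here is hypothetical: the function names, the diagonal Gaussian model of p(s′ ∣ s), and the projection step are assumptions for illustration, not the method's actual training procedure. The squared Mahalanobis distance used below matches −log q(s′ ∣ s) up to scale and terms that do not depend on s′.

```python
import numpy as np

def csd_distance(s_next, mu, var):
    """Squared Mahalanobis distance of s_next under N(mu, diag(var)).

    Assumed stand-in for -log q(s'|s): large when the transition is unlikely.
    """
    diff = s_next - mu
    return float(np.sum(diff * diff / var))

def allowed_latent_step(delta_phi, d):
    """Rescale delta_phi so that its norm is at most d.

    Illustrative projection only; it just makes the constraint
    ||phi(s') - phi(s)|| <= d visible on a single sample.
    """
    norm = float(np.linalg.norm(delta_phi))
    if norm == 0.0 or norm <= d:
        return delta_phi
    return delta_phi * (d / norm)

def intrinsic_reward(delta_phi, z):
    """LSD-style reward: latent displacement projected onto the skill z."""
    return float(np.dot(delta_phi, z))

# Hypothetical model prediction for the next state from some state s.
mu = np.array([0.0, 0.0])
var = np.array([1.0, 1.0])

easy_next = np.array([0.1, 0.0])  # close to the prediction: easy transition
hard_next = np.array([3.0, 0.0])  # far from the prediction: hard transition

d_easy = csd_distance(easy_next, mu, var)
d_hard = csd_distance(hard_next, mu, var)

z = np.array([1.0, 0.0])          # skill direction
raw_step = np.array([2.0, 0.0])   # unconstrained latent displacement

r_easy = intrinsic_reward(allowed_latent_step(raw_step, d_easy), z)
r_hard = intrinsic_reward(allowed_latent_step(raw_step, d_hard), z)

print(f"d_easy={d_easy:.4f}  d_hard={d_hard:.4f}")
print(f"r_easy={r_easy:.4f}  r_hard={r_hard:.4f}")
```

With the same desired latent step, the easy transition is squeezed to a tiny displacement and earns little reward, while the hard transition keeps its full displacement.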
Controllability-Aware Unsupervised Skill Discovery
One of the key capabilities of intelligent agents is the ability to discover useful skills without external supervision. However, the current unsupervised skill discovery methods are often limited...
