CSD

Created: 2024 May 22 2:15
Creator: Seonglae Cho
Edited: 2024 Jun 18 14:18

Controllability-aware Skill Discovery

We always seek new things

Learn which states are easy to control and which are hard to control → changing hard-to-control states earns more reward (low transition probability, large distance in skill space)
Shrink the skill space for easy-to-control states, so that easily controllable skills earn less reward.
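
As a rough formalization of the idea above, the objective can be written in the LSD style with the distance bound replaced by a negative log-likelihood. The symbols q_θ (a learned transition density model), π (the skill policy) and z (the skill vector) are notation assumed here, and the proportionality ignores additive constants; this is a sketch, not a verbatim statement of the paper's objective.

```latex
\max_{\pi,\,\phi}\ \mathbb{E}_{z,\,s,\,s'}\!\left[ (\phi(s') - \phi(s))^{\top} z \right]
\quad \text{s.t.} \quad
\|\phi(s') - \phi(s)\| \le d(s, s'),
\qquad
d(s, s') \propto -\log q_{\theta}(s' \mid s)
```

A rare transition (small q_θ) gets a large bound d, so ϕ may move far and the inner-product reward can be large; a common transition gets a small bound and therefore little reward.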

LSD with log probability

Learn which states are easy to control and which are hard to control by:
  • hard transition → small p(s′ ∣ s) → large ∥ϕ(s′) − ϕ(s)∥
  • easy transition → high p(s′ ∣ s) → small ∥ϕ(s′) − ϕ(s)∥
Put plainly, LSD is run only on data that satisfies these conditions: an easy transition is allowed only a small displacement in z space, while a hard transition is allowed a large one.
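
The two cases can be illustrated with a small sketch. It assumes a diagonal Gaussian transition model with a predicted mean and standard deviation, and uses only the data-dependent quadratic term of −log p(s′ ∣ s) as the distance so that it stays non-negative. The function names (`csd_distance`, `satisfies_constraint`, `skill_reward`) are hypothetical and chosen for illustration only.

```python
import math


def csd_distance(s_next, mean, std):
    """Quadratic term of -log p(s' | s) for a diagonal Gaussian (assumed model).

    Unlikely transitions (far from the predicted mean) get a large distance.
    """
    return sum(((x - m) / sd) ** 2 for x, m, sd in zip(s_next, mean, std))


def latent_step(phi_s, phi_next):
    """Displacement phi(s') - phi(s) in the learned latent space."""
    return [b - a for a, b in zip(phi_s, phi_next)]


def satisfies_constraint(phi_s, phi_next, dist):
    """Check ||phi(s') - phi(s)|| <= d(s, s')."""
    step = latent_step(phi_s, phi_next)
    return math.sqrt(sum(x * x for x in step)) <= dist


def skill_reward(phi_s, phi_next, z):
    """LSD-style reward: (phi(s') - phi(s))^T z."""
    return sum(d * zc for d, zc in zip(latent_step(phi_s, phi_next), z))


# Predicted next state for the current state s (illustrative numbers).
mean, std = [0.0, 0.0], [1.0, 1.0]

easy = csd_distance([0.1, 0.0], mean, std)  # close to the prediction
hard = csd_distance([3.0, 0.0], mean, std)  # far from the prediction

# The same unit step in latent space is allowed for the hard transition
# but violates the bound for the easy one.
phi_s, phi_next, z = [0.0, 0.0], [1.0, 0.0], [1.0, 0.0]
print(easy, satisfies_constraint(phi_s, phi_next, easy))
print(hard, satisfies_constraint(phi_s, phi_next, hard))
print(skill_reward(phi_s, phi_next, z))
```

Because the reward is an inner product with z, its magnitude is capped by the allowed latent step, which is why easy transitions end up with little reward.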