LSD
Take changes in state distance into account for more complex and dynamic skills by incentivizing challenging behaviors.
Align the direction of skill vector and state change.
LSD adds a distance consideration to learning skill policy and by adding a term to maximize : and regulates to reflect distance in : preventing from becoming infinitely large.
Limitation
euclidan distance based does not always consistent
Lipschitz-constrained Unsupervised Skill Discovery
We study the problem of unsupervised skill discovery, whose goal is to learn a set of diverse and useful skills with no external reward. There have been a number of skill discovery methods based...
https://arxiv.org/abs/2202.00914


Seonglae Cho