Guides exploration by assigning high intrinsic rewards to unexperienced states or changes (e.g., unexpected events) in RL Exploration. Exploration methods (exploration policies) are learned across diverse environments through these intrinsic rewards, and the learned exploration policies can then be applied directly to new environments via Transfer Learning.
In complex environments with sparse rewards, the agent explores efficiently and captures important states and changes even when external rewards are rare, recording higher extrinsic rewards than the standard version. In simple environments, however, the exploration bonus over-drives behavior away from the task goal, so the agent converges to a suboptimal policy with low extrinsic reward.
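A minimal sketch of such a change-based intrinsic bonus, assuming the reward is proportional to the change between consecutive latent states from some encoder; the function names, the norm-based change measure, and the mixing weight beta below are illustrative assumptions, not the paper's exact formulation:

```python
import numpy as np

def change_bonus(phi_s, phi_s_next, scale=1.0):
    # Intrinsic reward grows with the magnitude of change between
    # consecutive latent states (an assumed norm-based change measure).
    return scale * float(np.linalg.norm(phi_s_next - phi_s))

def shaped_reward(r_ext, phi_s, phi_s_next, beta=0.1):
    # Total reward mixes the extrinsic signal with the change bonus;
    # beta is a hypothetical mixing weight, not taken from the paper.
    return r_ext + beta * change_bonus(phi_s, phi_s_next)

# Toy usage: with a sparse extrinsic reward of 0, the bonus alone drives exploration
phi_s = np.zeros(8)
phi_s_next = np.full(8, 0.5)
print(shaped_reward(0.0, phi_s, phi_s_next))
```

In this sketch, beta captures the trade-off described above: a large beta helps in sparse-reward environments but can drown out the extrinsic signal in simple ones.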
NeurIPS 2021
https://proceedings.neurips.cc/paper/2021/file/abe8e03e3ac71c2ec3bfb0de042638d8-Paper.pdf
World Model Agents with Change-Based Intrinsic Motivation
Sparse reward environments pose a significant challenge for reinforcement learning due to the scarcity of feedback. Intrinsic motivation and transfer learning have emerged as promising strategies...
https://openreview.net/forum?id=0io7gvXniL#discussion


Seonglae Cho