CBET

Creator
Seonglae Cho
Created
2025 Apr 10 23:32
Edited
2025 Apr 10 23:42
Refs
Change-Based Exploration Transfer (C-BET) guides exploration by assigning high intrinsic rewards to rarely experienced states or changes (e.g., unexpected events) in
RL Exploration
. It learns exploration methods (exploration policies) across various environments through intrinsic rewards; these learned exploration policies can then be applied directly to new environments via
Transfer Learning
In complex environments with sparse rewards, it explores efficiently, capturing important states and changes even when extrinsic rewards are rare, and achieves higher extrinsic returns than the standard baseline. In simple environments, however, the exploration drive is over-induced, producing actions misaligned with the goal and convergence to suboptimal policies with low extrinsic reward.
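The change-based intrinsic reward can be sketched with simple visit counts. This is a minimal illustration, not the paper's exact formula: it assumes discrete, hashable states, encodes the "change" as the element-wise difference between consecutive states, and rewards inversely to how often that state and that change have been seen. All names here (`cbet_intrinsic_reward`, the count tables) are hypothetical.

```python
from collections import defaultdict


def cbet_intrinsic_reward(state, next_state, state_counts, change_counts):
    """Count-based intrinsic reward: rare states and rare changes score high.

    Simplified sketch of the C-BET idea (assumed form, not the exact paper
    formula): reward is inversely proportional to the visit count of the
    next state plus the occurrence count of the observed change.
    """
    # Hypothetical change encoding: element-wise difference of state tuples.
    change = tuple(n - s for s, n in zip(state, next_state))
    state_counts[next_state] += 1
    change_counts[change] += 1
    return 1.0 / state_counts[next_state] + 1.0 / change_counts[change]


state_counts = defaultdict(int)
change_counts = defaultdict(int)

# First time seeing this state and this change -> maximal reward.
r1 = cbet_intrinsic_reward((0, 0), (1, 0), state_counts, change_counts)
# Repeating the same transition halves both count terms.
r2 = cbet_intrinsic_reward((0, 0), (1, 0), state_counts, change_counts)
```

Because the reward decays as counts grow, the agent is pushed toward transitions it has not produced before, which matches the sparse-reward behavior described above.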
2021 NeurIPS, originally built on
IMPALA
; a 2025 NLDL follow-up extends it to
Model based RL
such as
Dreamer
V3.
