Entropy Bonus

Creator

Seonglae Cho

Created

2024 May 1 4:34

Editor

Seonglae Cho

Edited

2024 Jun 16 11:10

Refs

Entropy regularization: the policy's entropy, scaled by a temperature coefficient, is added to the objective as a bonus

Prevents the policy from collapsing to a deterministic one

Exploration noise in continuous spaces like
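As a minimal sketch of the entropy bonus for a discrete policy: the entropy of the action distribution is multiplied by a temperature and subtracted from the loss, so higher-entropy policies get a lower loss. The names `entropy`, `loss_with_entropy_bonus`, `policy_loss`, and `alpha` are illustrative assumptions, not taken from any particular library.

```python
import math


def entropy(probs):
    """Shannon entropy (in nats) of a categorical distribution."""
    # Terms with p == 0 contribute nothing and are skipped to avoid log(0).
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def loss_with_entropy_bonus(policy_loss, probs, alpha):
    """Subtract the temperature-scaled entropy from a loss to be minimized.

    `alpha` is the (hypothetical) temperature coefficient: a larger value
    rewards more random policies more strongly.
    """
    return policy_loss - alpha * entropy(probs)


uniform = [0.25, 0.25, 0.25, 0.25]
peaked = [0.97, 0.01, 0.01, 0.01]

# With the same base loss, the near-deterministic policy is penalized
# relative to the uniform one.
loss_uniform = loss_with_entropy_bonus(1.0, uniform, alpha=0.1)
loss_peaked = loss_with_entropy_bonus(1.0, peaked, alpha=0.1)
```

Setting `alpha` to zero recovers the unregularized loss, which is one way to see the temperature as a knob trading off reward against randomness.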

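Note that maximizing entropy requires differentiating through the sampling distribution. We can do this via the reparameterization trick: the action is written as a deterministic function of the policy parameters and parameter-free noise, so gradients can pass through the sample.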
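The following is a rough sketch of that idea for a one-dimensional Gaussian policy, under stated assumptions: the critic is replaced by a toy quadratic `q(a) = -(a - target)**2`, and all names (`pathwise_gradients`, `target`, `alpha`) are hypothetical. The action is sampled as `a = mu + sigma * eps` with `eps ~ N(0, 1)`, so the gradient of `E[q(a) - alpha * log pi(a)]` can be estimated by differentiating through `a`.

```python
import math
import random


def pathwise_gradients(mu, log_sigma, alpha, target, n_samples=20000, seed=0):
    """Monte Carlo gradient of E[q(a) - alpha * log pi(a)] w.r.t. (mu, log_sigma).

    Uses the reparameterization a = mu + sigma * eps, eps ~ N(0, 1).
    q(a) = -(a - target)**2 is a toy stand-in for a learned critic.
    """
    rng = random.Random(seed)
    sigma = math.exp(log_sigma)
    grad_mu = 0.0
    grad_log_sigma = 0.0
    for _ in range(n_samples):
        eps = rng.gauss(0.0, 1.0)
        a = mu + sigma * eps           # reparameterized sample
        dq_da = -2.0 * (a - target)    # derivative of the toy critic
        # After substituting a = mu + sigma * eps:
        #   -log pi(a) = 0.5 * eps**2 + log_sigma + 0.5 * log(2 * pi)
        # so its derivative is 0 w.r.t. mu and 1 w.r.t. log_sigma.
        grad_mu += dq_da * 1.0                           # da/dmu = 1
        grad_log_sigma += dq_da * sigma * eps + alpha    # da/dlog_sigma = sigma * eps
    return grad_mu / n_samples, grad_log_sigma / n_samples


g_mu, g_log_sigma = pathwise_gradients(mu=0.5, log_sigma=0.0, alpha=0.2, target=1.0)
# Closed form for this toy setup: d/dmu = -2 * (mu - target),
# d/dlog_sigma = -2 * sigma**2 + alpha.
```

In this toy setup the `log_sigma` gradient vanishes at `sigma**2 = alpha / 2` rather than at zero, which illustrates how the entropy term keeps the policy from becoming deterministic.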

Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement...

Model-free deep reinforcement learning (RL) algorithms have been demonstrated on a range of challenging decision making and control tasks. However, these methods typically suffer from two major...

https://arxiv.org/abs/1801.01290
