DoubleDQN

Creator
Seonglae Cho
Created
2024 Apr 23 18:3
Edited
2024 May 7 17:14

Simple but powerful

Use the current (online) network for action selection and the RL Target Network for action evaluation, to de-correlate errors in action selection and evaluation.
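The split between selection and evaluation can be sketched in plain Python. This is a minimal illustration, not a reference implementation: the function names (`dqn_target`, `double_dqn_target`), the list-based Q-values, and the sample numbers are all assumptions made for the example.

```python
from typing import Sequence


def argmax(values: Sequence[float]) -> int:
    """Return the index of the largest value (first index on ties)."""
    return max(range(len(values)), key=lambda i: values[i])


def dqn_target(reward: float, done: bool, gamma: float,
               q_target_next: Sequence[float]) -> float:
    """Vanilla DQN target: the target network both selects and evaluates."""
    if done:
        return reward
    return reward + gamma * max(q_target_next)


def double_dqn_target(reward: float, done: bool, gamma: float,
                      q_online_next: Sequence[float],
                      q_target_next: Sequence[float]) -> float:
    """Double DQN target: online network selects, target network evaluates."""
    if done:
        return reward
    best_action = argmax(q_online_next)               # selection (online network)
    return reward + gamma * q_target_next[best_action]  # evaluation (target network)


# Hypothetical Q-values for the next state, one entry per action.
q_online_next = [1.0, 2.5, 2.0]
q_target_next = [3.0, 1.5, 2.0]

vanilla = dqn_target(1.0, False, 0.5, q_target_next)
double = double_dqn_target(1.0, False, 0.5, q_online_next, q_target_next)
print(vanilla, double)
```

Because the online network picks action 1 while the target network scores that action lower than its own maximum, the Double DQN target here is smaller than the vanilla one. It can never be larger, since the target network's value for any single action is bounded by its maximum.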

Clipped Double Q-learning

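Learn two Q-functions and use the minimum of their estimates as the target.
Rather than separating action selection from value estimation, it takes the minimum of the two estimates, which lowers the target overall.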
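A minimal sketch of the clipped target, assuming both Q-functions have already been evaluated on the same next state-action pair and return scalar estimates. The function name `clipped_double_q_target` and the sample values are hypothetical.

```python
def clipped_double_q_target(reward: float, done: bool, gamma: float,
                            q1_next: float, q2_next: float) -> float:
    """Build one shared target from the smaller of two Q estimates."""
    if done:
        return reward
    # Taking the minimum means the target is never above either estimate.
    return reward + gamma * min(q1_next, q2_next)


# Hypothetical estimates from the two Q-functions for the same (s', a').
target = clipped_double_q_target(1.0, False, 0.5, q1_next=4.0, q2_next=3.0)
print(target)
```

In this sketch both Q-functions would be regressed toward the same `target`, so an optimistic error in one estimate is capped by the other.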
 
 
 
 
 
