RL Target Network

Creator
Seonglae Cho
Created
2024 Apr 23 18:1
Edited
2024 Apr 23 18:2

Target network $\phi'$

Delayed updates prevent the moving-target problem.
Compute targets with a target network whose parameters do not change during the inner loop.
$$y = r + \lambda Q_{\phi'}\left(s', \arg\max_{a'} Q_{\phi'}(s', a')\right)$$
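The target computation and the delayed update can be sketched in a few lines. This is a minimal illustration, not a reference implementation: it assumes a tabular Q-function (a list of lists) in place of a neural network, reads $\lambda$ as the discount factor, and uses hypothetical names (`compute_target`, `train`, `outer_steps`, `inner_steps`) that do not come from the original note.

```python
import copy
import random


def compute_target(q_target, reward, next_state, discount, done):
    """y = r + discount * Q_target(s', argmax_a' Q_target(s', a'))."""
    if done:
        # Assumption: no bootstrapping from terminal states.
        return reward
    next_values = q_target[next_state]
    # Action selection and evaluation both use the frozen target table.
    best_action = max(range(len(next_values)), key=lambda a: next_values[a])
    return reward + discount * next_values[best_action]


def train(transitions, n_states, n_actions, discount=0.9, lr=0.5,
          outer_steps=20, inner_steps=50, seed=0):
    """Tabular Q-learning where targets come from a periodically synced copy."""
    rng = random.Random(seed)
    q = [[0.0] * n_actions for _ in range(n_states)]  # online table (phi)
    for _ in range(outer_steps):
        # Delayed update: phi' <- phi, then phi' stays fixed in the inner loop.
        q_target = copy.deepcopy(q)
        for _ in range(inner_steps):
            s, a, r, s2, done = rng.choice(transitions)
            y = compute_target(q_target, r, s2, discount, done)
            # Move the online estimate toward the fixed target.
            q[s][a] += lr * (y - q[s][a])
    return q


if __name__ == "__main__":
    # Hypothetical two-state chain: 0 -> 1 (reward 0), then 1 -> terminal (reward 1).
    data = [(0, 0, 0.0, 1, False), (1, 0, 1.0, 1, True)]
    print(train(data, n_states=2, n_actions=2))
```

Because `q_target` is only refreshed once per outer iteration, every update in the inner loop regresses toward targets that stay fixed, which is the point of the delayed update.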
 
 
 
 
 
 
 
