Agent Interpretability

Creator
Seonglae Cho
Created
2025 Nov 1 16:11
Edited
2025 Nov 1 16:44

Action Interpretability, Decision Interpretability


RNN transition model Interpretability RL

We confirmed that the RNN contains mechanisms closely resembling the main components of classical search algorithms (planning and search): a plan representation, a state transition model, and a value function.
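For reference, the three components named above can be sketched in their classical form. This is only an illustrative toy, not the RNN's learned mechanism: the 4x4 grid, the names `transition`, `value`, and `search`, and the Manhattan-distance heuristic are all assumptions made for the example.

```python
from typing import Dict, List, Tuple

State = Tuple[int, int]  # (row, col) on a toy grid; hypothetical state type
Action = str

# Hypothetical 4x4 grid world with a fixed goal cell.
GRID_SIZE = 4
GOAL: State = (3, 3)
MOVES: Dict[Action, Tuple[int, int]] = {
    "up": (-1, 0),
    "down": (1, 0),
    "left": (0, -1),
    "right": (0, 1),
}


def transition(state: State, action: Action) -> State:
    """State transition model: predict the next state, clamped to the grid."""
    dr, dc = MOVES[action]
    row = min(max(state[0] + dr, 0), GRID_SIZE - 1)
    col = min(max(state[1] + dc, 0), GRID_SIZE - 1)
    return (row, col)


def value(state: State) -> float:
    """Value function: negated Manhattan distance to the goal (assumed heuristic)."""
    distance = abs(GOAL[0] - state[0]) + abs(GOAL[1] - state[1])
    return float(-distance)


def search(state: State, depth: int) -> Tuple[float, List[Action]]:
    """Depth-limited search; the returned action list is the plan representation."""
    if depth == 0 or state == GOAL:
        return value(state), []
    best_value, best_plan = float("-inf"), []
    for action in MOVES:
        leaf_value, tail = search(transition(state, action), depth - 1)
        if leaf_value > best_value:
            best_value, best_plan = leaf_value, [action] + tail
    return best_value, best_plan
```

Calling `search((0, 0), 6)` expands candidate futures with `transition`, scores the leaves with `value`, and returns the best action sequence as the plan. Interpretability work on the RNN looks for internal activations that play these same three roles.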

