CRL Paper

Creator
Seonglae Cho
Created
2025 Mar 17 15:38
Edited
2025 Sep 11 16:34

Steering on correlated SAE features improves benchmarks, not only probing
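As a minimal sketch of what steering on a single SAE feature could look like, assuming the feature is represented by a decoder direction that is normalized, scaled, and added to a hidden activation. The names `steer`, `hidden`, `direction`, and `alpha` are hypothetical, and the toy vectors stand in for a real model activation and a trained SAE decoder row.

```python
import math

def steer(hidden, direction, alpha):
    """Shift an activation along one feature direction.

    hidden: activation vector (assumed to come from one model layer).
    direction: decoder direction of the chosen SAE feature (assumed given).
    alpha: steering strength, in units of the normalized direction.
    """
    norm = math.sqrt(sum(x * x for x in direction))
    if norm == 0.0:
        raise ValueError("feature direction must be non-zero")
    # Add alpha times the unit-norm direction to every coordinate.
    return [h + alpha * d / norm for h, d in zip(hidden, direction)]

# Toy values only; not taken from any model or experiment.
hidden = [0.5, -1.0, 2.0]
direction = [3.0, 0.0, 4.0]
steered = steer(hidden, direction, alpha=2.0)

# The shift projected back onto the unit direction recovers alpha.
unit = [d / 5.0 for d in direction]
shift = sum((s - h) * u for s, h, u in zip(steered, hidden, unit))
```

Normalizing the direction keeps `alpha` comparable across features whose decoder rows have different norms; whether that is the right choice for this work is an assumption.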

CRL (Control model training with RL), CTRL (Control model Training with RL)

  • SPOT (Sparse Policy Optimization for Circuit control)
  • OSCAR (Optimizing Sparse Circuits via Autoencoder Reinforcement)
declaration file after acknowledgement
Control RL Papers