SAELens Steering

Creator
Seonglae Cho
Created
2025 Apr 18 11:6
Editor
Seonglae Cho
Edited
2025 Apr 18 11:9
steering_vector = sae.W_dec[latent_index]
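
The line above takes one row of the SAE decoder as the steering direction. A minimal sketch of how such a vector might be applied, assuming `W_dec` has shape `(d_sae, d_model)` and the residual stream has shape `(batch, seq, d_model)`. `ToySAE`, `make_steering_hook`, the sizes, `latent_index = 3`, and `coeff = 4.0` are hypothetical stand-ins for illustration; a real SAE would be loaded through SAELens rather than initialized randomly.

```python
import torch

torch.manual_seed(0)

# Hypothetical sizes for illustration only.
d_sae, d_model = 16, 8


class ToySAE:
    """Stand-in for a trained SAE; only the decoder matrix is used here."""

    def __init__(self):
        # Assumed layout: one decoder row per latent, shape (d_sae, d_model).
        self.W_dec = torch.randn(d_sae, d_model)


sae = ToySAE()
latent_index = 3  # hypothetical latent chosen for illustration
steering_vector = sae.W_dec[latent_index]  # shape (d_model,)


def make_steering_hook(vector, coeff):
    """Return a hook-shaped function that adds coeff * vector at every position."""

    def steering_hook(resid, hook=None):
        # resid: (batch, seq, d_model); vector broadcasts over batch and seq.
        return resid + coeff * vector

    return steering_hook


# Placeholder residual-stream activations instead of a real forward pass.
resid = torch.zeros(2, 5, d_model)
hook_fn = make_steering_hook(steering_vector, coeff=4.0)
steered = hook_fn(resid)
```

With TransformerLens, a function of this shape could plausibly be passed to `model.run_with_hooks(tokens, fwd_hooks=[(hook_name, hook_fn)])`, where `hook_name` is the hook point the SAE was trained on. The coefficient usually needs tuning: too small has no visible effect, too large tends to degrade the output.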
Copyright Seonglae Cho