AI Feature

Creator
Seonglae Cho
Created
2024 Apr 6 13:13
Edited
2025 Mar 4 14:31

Write vector

Features are not just a convenient post-hoc description: in some fundamental sense, the network is composed of features
Vector written to the residual stream by a node
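
As a minimal sketch of the write-vector idea, the snippet below treats the residual stream as a plain list of floats and a node's write as an additive update. The decomposition of the write vector into a scalar activation times a fixed direction is an assumption made for illustration, and the names `write_vector` and `write_to_residual` are hypothetical.

```python
from typing import List

Vector = List[float]

def write_vector(activation: float, direction: Vector) -> Vector:
    """Vector a node contributes: assumed here to be activation * direction."""
    return [activation * d for d in direction]

def write_to_residual(residual: Vector, write: Vector) -> Vector:
    """The residual stream is updated additively by each node's write."""
    return [r + w for r, w in zip(residual, write)]

# Hypothetical toy values, not taken from any real model.
residual = [0.5, -1.0, 0.0]
direction = [0.0, 1.0, 1.0]

write = write_vector(2.0, direction)
updated = write_to_residual(residual, write)
print(write, updated)
```

Because the update is additive, later nodes read the sum of everything written so far, which is why a single node's contribution can be studied as its own vector.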
Notions of Interpretable Feature Learning
 
 
AI Feature Metrics
 
 

Safety relevant feature

Removing features had a greater impact on the model than amplifying them. This suggests that the influence of a feature may saturate at high activations.
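
The saturation intuition can be illustrated with a toy calculation. This is only a sketch under a strong assumption: the downstream response to a feature is modeled as `tanh` of its activation, which is a stand-in for any saturating nonlinearity and not a claim about how a real model behaves. The function names are hypothetical.

```python
import math
from typing import Tuple

def readout(activation: float) -> float:
    """Toy downstream response; tanh is an assumed saturating nonlinearity."""
    return math.tanh(activation)

def intervention_effects(activation: float, scale: float = 2.0) -> Tuple[float, float]:
    """Return (effect of ablating the feature, effect of amplifying it by `scale`)."""
    base = readout(activation)
    ablation_effect = abs(base - readout(0.0))                  # feature set to zero
    amplification_effect = abs(readout(scale * activation) - base)  # feature scaled up
    return ablation_effect, amplification_effect

ablate, amplify = intervention_effects(2.0)
print(f"ablation effect: {ablate:.3f}, amplification effect: {amplify:.3f}")
```

In this toy setting an already-active feature sits near the flat part of the curve, so scaling it up changes the response far less than removing it does.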

Relational composition

How neural networks combine feature vectors to represent complex relationships. Neural nets use vector addition for ordered relationships, vector differences for grammatical relationships, outer products to compose complex structures and interactions, and positional encodings for ID referencing.
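
Three of these operations (addition, difference, and outer product) can be sketched on toy vectors. All vectors and names below are hypothetical, the role vectors are assumed orthonormal so that unbinding is exact, and positional encodings are left out.

```python
from typing import List

Vector = List[float]
Matrix = List[Vector]

def add(u: Vector, v: Vector) -> Vector:
    return [a + b for a, b in zip(u, v)]

def sub(u: Vector, v: Vector) -> Vector:
    return [a - b for a, b in zip(u, v)]

def outer(u: Vector, v: Vector) -> Matrix:
    """Outer product: binds a role vector u to a filler vector v."""
    return [[a * b for b in v] for a in u]

def unbind(role: Vector, bound: Matrix) -> Vector:
    """Recover the filler for `role` (role^T @ bound); exact for orthonormal roles."""
    return [sum(r * row[j] for r, row in zip(role, bound))
            for j in range(len(bound[0]))]

# Vector difference: a "plural" offset taken from one pair, applied to another word.
cat, cats, dog = [1.0, 0.0, 0.0], [1.0, 0.0, 1.0], [0.0, 1.0, 0.0]
plural = sub(cats, cat)
dogs = add(dog, plural)

# Outer products: bind fillers to roles, then sum the bindings into one structure.
subject, obj = [1.0, 0.0], [0.0, 1.0]
alice, bob = [1.0, 0.0, 0.0], [0.0, 1.0, 0.0]
bound = [add(r1, r2) for r1, r2 in zip(outer(subject, alice), outer(obj, bob))]
recovered = unbind(subject, bound)
print(dogs, recovered)
```

The outer-product form keeps track of which filler belongs to which role, at the cost of a representation whose size grows with the product of the two dimensions.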
 
 
 
