Write vector
Not just a convenient post-hoc description; in some fundamental sense composed of features
Vector written to the residual stream by a node
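A minimal sketch of this definition, assuming a toy residual stream of width 4 and a hypothetical node (for example an attention head or an MLP neuron) whose output is simply added to the stream. The names `apply_write`, `residual`, and `write_vector` and all numbers are illustrative, not from any library or from the source.

```python
# Toy illustration: a node "writes" to the residual stream by adding its
# output vector to the running state. All names and values are hypothetical.

def apply_write(residual, write_vector):
    """Return the residual stream after a node adds its write vector."""
    return [r + w for r, w in zip(residual, write_vector)]

residual = [1.0, 0.0, -0.5, 2.0]        # state of the stream before the node
write_vector = [0.5, 0.25, 0.0, -1.0]   # what this node contributes

updated = apply_write(residual, write_vector)

# Because the write is additive, the node's write vector is exactly the
# difference between the stream after and before the node.
recovered = [u - r for u, r in zip(updated, residual)]
```

Under this additive assumption, `recovered` equals `write_vector`, which is why the write vector can be read off as a node's contribution to the stream.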
Notions of Interpretable Feature Learning
AI Feature Metrics
Safety-relevant feature
Removing features had a greater impact on the model than amplifying them. This suggests that the influence of features may saturate at high activations.
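One way to picture the saturation hypothesis is a toy model, not the experiment behind the note. The sketch assumes a hypothetical downstream response that saturates (here `tanh`) and an assumed baseline activation of 3.0. Both choices are illustrative only.

```python
import math

def downstream(activation):
    """Hypothetical saturating response to a feature's activation."""
    return math.tanh(activation)

baseline = 3.0                              # assumed high feature activation
out_base = downstream(baseline)
out_removed = downstream(0.0)               # feature ablated (set to zero)
out_amplified = downstream(2 * baseline)    # feature activation doubled

# Compare how far each intervention moves the downstream output.
effect_of_removal = abs(out_base - out_removed)
effect_of_amplification = abs(out_amplified - out_base)
```

In this toy setting the response is already near its ceiling at the baseline, so doubling the activation barely changes the output while zeroing it removes almost all of it.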
Relational composition
How neural networks combine feature vectors to represent complex relationships. Neural nets use vector addition for ordered relationships, vector differences for grammatical relationships, outer products to compose complex structures and interactions, and positional encodings for ID referencing.
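Three of the operations named above can be sketched on toy vectors. The 3-dimensional "feature vectors" and the names `walk`, `walked`, `jump`, and `role` are hypothetical values chosen for illustration. They are not taken from a trained model, and positional encodings are left out for brevity.

```python
def add(u, v):
    """Element-wise vector addition."""
    return [a + b for a, b in zip(u, v)]

def sub(u, v):
    """Element-wise vector difference."""
    return [a - b for a, b in zip(u, v)]

def outer(u, v):
    """Outer product: a len(u) x len(v) matrix of pairwise products."""
    return [[a * b for b in v] for a in u]

# Hypothetical feature vectors.
walk = [1.0, 0.0, 2.0]
walked = [1.0, 1.0, 2.0]
jump = [0.0, 3.0, 1.0]

# Vector difference: a direction standing in for a grammatical relation.
past_tense = sub(walked, walk)

# Vector addition: apply that relation direction to another feature.
jumped_guess = add(jump, past_tense)

# Outer product: bind a (hypothetical) role vector to a filler vector.
role = [1.0, 0.0]
binding = outer(role, walk)
```

Addition and difference keep the result in the original vector space, while the outer product yields a matrix with one row per role dimension, which is what lets it pair up two vectors rather than merely sum them.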