Emb2Emb

Creator
Seonglae Cho
Created
2025 May 10 17:9
Edited
2025 Aug 10 21:45

Cross-model steering vector extraction

Token embeddings of language models exhibit common geometric structure. Globally, the token embeddings of different models often share similar relative orientations. Locally, token embeddings lie on a lower-dimensional manifold: tokens with lower intrinsic dimension often form semantically coherent clusters, while those with higher intrinsic dimension do not. This alignment in token embeddings also persists through the residual streams of language models.
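
One way to make "similar relative orientations" concrete is to compare the pairwise cosine-similarity structure of the same tokens in two embedding spaces. The sketch below is only an illustration under stated assumptions: the embeddings are synthetic toy data, the function names are hypothetical, and correlating cosine-similarity matrices is one plausible measure, not necessarily the metric used in the original work.

```python
import numpy as np

def cosine_similarity_matrix(emb):
    """Pairwise cosine similarities between the rows of emb (tokens x dims)."""
    unit = emb / np.linalg.norm(emb, axis=1, keepdims=True)
    return unit @ unit.T

def orientation_similarity(emb_a, emb_b):
    """Pearson correlation between the pairwise-cosine structure of two
    embedding tables whose rows refer to the same tokens (hypothetical helper)."""
    sim_a = cosine_similarity_matrix(emb_a)
    sim_b = cosine_similarity_matrix(emb_b)
    upper = np.triu_indices(emb_a.shape[0], k=1)  # each token pair once
    return float(np.corrcoef(sim_a[upper], sim_b[upper])[0, 1])

# Toy data (assumption): "model B" is model A embedded isometrically into a
# higher-dimensional space plus small noise; "model C" is unrelated.
rng = np.random.default_rng(0)
emb_a = rng.normal(size=(50, 16))
basis, _ = np.linalg.qr(rng.normal(size=(32, 16)))  # orthonormal columns
emb_b = emb_a @ basis.T + 0.05 * rng.normal(size=(50, 32))
emb_c = rng.normal(size=(50, 32))

score_related = orientation_similarity(emb_a, emb_b)
score_unrelated = orientation_similarity(emb_a, emb_c)
print(score_related, score_unrelated)
```

The measure works across models of different width because it only compares angles between token pairs within each model, never coordinates across models.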
Emb2Emb is a method that transfers steering vectors from one language model to another. Weights are learned on the unembedding matrices to map the source model's embedding space to the target model's; a steering vector obtained in the source model is passed through this map and applied to the target model at each layer, scaled by a coefficient. Because the map can connect embedding spaces of different dimensionality, complex features can also be imitated, for example by steering a model with features taken from a larger one.
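
The transfer described above can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: it assumes both models share a vocabulary so that unembedding rows line up token by token, it assumes the learned weights form a single linear map fit by least squares, and all names (`fit_embedding_map`, `transfer_steering_vector`, `steer_residual_stream`) and the toy data are hypothetical.

```python
import numpy as np

def fit_embedding_map(unembed_src, unembed_tgt):
    """Least-squares fit of A minimizing ||unembed_src @ A - unembed_tgt||.
    Rows of both matrices must correspond to the same tokens (assumption)."""
    A, *_ = np.linalg.lstsq(unembed_src, unembed_tgt, rcond=None)
    return A  # shape: (d_src, d_tgt)

def transfer_steering_vector(v_src, A):
    """Map a steering vector from the source space into the target space."""
    return v_src @ A

def steer_residual_stream(hidden, v_tgt, coeff):
    """Add the scaled steering vector at every layer and position.
    hidden has shape (n_layers, seq_len, d_tgt)."""
    return hidden + coeff * v_tgt

# Toy setup (assumption): a larger source model (d=32) and a smaller target
# model (d=16) whose unembeddings are related by a noisy linear map.
rng = np.random.default_rng(0)
unembed_src = rng.normal(size=(200, 32))
true_map = rng.normal(size=(32, 16))
unembed_tgt = unembed_src @ true_map + 0.01 * rng.normal(size=(200, 16))

A = fit_embedding_map(unembed_src, unembed_tgt)
v_src = rng.normal(size=32)                 # stand-in for a real steering vector
v_tgt = transfer_steering_vector(v_src, A)

hidden = rng.normal(size=(4, 5, 16))        # stand-in for target activations
steered = steer_residual_stream(hidden, v_tgt, coeff=4.0)
```

In a real model the addition would typically be done with forward hooks on each layer, and the coefficient would be tuned per model, since the two residual streams need not share a scale.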