DAS

Creator
Seonglae Cho
Created
2024 Nov 27 17:24
Editor
Edited
2024 Nov 27 17:35

Distributed Alignment Search

DAS learns an Orthogonal Matrix that rotates a layer's activations into a new basis. It uses interchange interventions to test whether a high-level causal abstraction holds, and it optimizes the rotation so that the alignment improves. It focuses on distributed representations rather than, like a Neuron SAE, trying to decompose each neuron into monosemantic features.
Rotating the basis of the activation vector can identify high-level causal variables, but the Superposition Hypothesis limits this: the rotated space has the same number of dimensions as the original, so it cannot separate more features than there are dimensions.
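The rotate-then-swap idea can be illustrated with a toy example. This is a minimal sketch under loud assumptions, not the method's actual implementation: the two-neuron "layer" (`encode`), the downstream `readout`, and the hidden angle `PHI` are all hypothetical, and a coarse grid search over a single rotation angle stands in for the gradient-based learning of the orthogonal matrix.

```python
import math
from itertools import product

PHI = math.radians(35)                  # hypothetical angle of the hidden direction
U = (math.cos(PHI), math.sin(PHI))      # direction that encodes high-level variable A
W = (-math.sin(PHI), math.cos(PHI))     # orthogonal direction that encodes B

def encode(a, b):
    """Toy 'layer': both variables are spread across both neurons."""
    return [a * U[0] + b * W[0], a * U[1] + b * W[1]]

def readout(h):
    """Toy downstream computation: reads A back out along U."""
    return h[0] * U[0] + h[1] * U[1]

def rotation(theta):
    """2x2 orthogonal (rotation) matrix for angle theta."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s], [s, c]]

def matvec(m, v):
    return [sum(mij * vj for mij, vj in zip(row, v)) for row in m]

def interchange(base, source, R, k=1):
    """Rotate both activations, copy the first k rotated coordinates
    from source into base, then rotate back."""
    rb, rs = matvec(R, base), matvec(R, source)
    mixed = rs[:k] + rb[k:]
    R_T = [list(col) for col in zip(*R)]  # transpose = inverse for orthogonal R
    return matvec(R_T, mixed)

def interchange_loss(theta):
    """Mean squared gap between the intervened output and the high-level
    prediction (A takes the source's value) over all base/source pairs."""
    R = rotation(theta)
    values = list(product([0.0, 1.0], repeat=2))
    errors = []
    for (a_b, b_b), (a_s, b_s) in product(values, repeat=2):
        h = interchange(encode(a_b, b_b), encode(a_s, b_s), R)
        errors.append((readout(h) - a_s) ** 2)
    return sum(errors) / len(errors)

# Grid search over whole degrees (a stand-in for gradient descent on R).
degrees = range(-180, 180)
best_deg = min(degrees, key=lambda d: interchange_loss(math.radians(d)))
best_loss = interchange_loss(math.radians(best_deg))
identity_loss = interchange_loss(0.0)   # swapping a raw neuron, no rotation
print(best_deg, best_loss, identity_loss)
```

Swapping a raw neuron (the identity rotation) mixes A and B, while the rotation that aligns the first coordinate with `U` transfers A cleanly. Because `R` is square, this search can only ever expose as many independent directions as there are neurons, which is the same-dimension limit noted above.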
