DAS

Creator: Seonglae Cho
Created: 2024 Nov 27 17:24
Edited: 2025 Nov 10 18:17

Distributed Alignment Search

DAS learns an Orthogonal Matrix that rotates the activations of a layer into a new basis. The rotation is optimized with interchange interventions so that the rotated activations align with the variables of a high-level causal abstraction. It focuses on distributed representations, unlike an SAE, which tries to decompose activations into monosemantic features.
Rotating the basis of the activation vector can identify high-level causal variables, but the approach is limited under the Superposition Hypothesis because the rotated basis has the same number of dimensions as the original activation space.
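The rotate-then-swap step can be illustrated with a minimal NumPy sketch. It is not the original DAS implementation: the function names `random_orthogonal` and `interchange_intervention` are hypothetical, the subspace size `k` is chosen by hand, and the orthogonal matrix is a fixed random one, whereas DAS would learn it by optimizing interchange intervention accuracy against a high-level causal model.

```python
import numpy as np


def random_orthogonal(d, seed=0):
    """Return a d x d orthogonal matrix (stand-in for a learned rotation)."""
    rng = np.random.default_rng(seed)
    # QR decomposition of a random Gaussian matrix yields an orthogonal Q.
    q, _ = np.linalg.qr(rng.standard_normal((d, d)))
    return q


def interchange_intervention(base, source, R, k):
    """Swap the first k rotated coordinates of `base` with those of `source`.

    The columns of R are the new basis vectors, so R.T @ x gives the
    coordinates of x in the rotated basis and R @ z maps back.
    """
    base_rot = R.T @ base
    source_rot = R.T @ source
    mixed = base_rot.copy()
    mixed[:k] = source_rot[:k]  # patch the aligned subspace from the source run
    return R @ mixed            # return to the original activation basis


if __name__ == "__main__":
    d, k = 4, 2
    R = random_orthogonal(d)
    base = np.array([1.0, 2.0, 3.0, 4.0])      # activation from the base input
    source = np.array([-1.0, 0.0, 5.0, 2.0])   # activation from the source input
    patched = interchange_intervention(base, source, R, k)
    print(patched)
```

Because the rotation is square, the patched subspace can use at most as many directions as the layer has dimensions, which is the same-size limitation noted above.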