SADI

Generates a dynamic steering vector for each input, so the intervention reflects that input's semantics.

Compute activation differences between contrastive pairs (positive vs. negative) → identify the most important components (attention heads, hidden states, neurons).
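A minimal sketch of this identification step, assuming the activations of each contrastive pair are already collected as flat vectors and that "important" means the largest absolute mean difference. The names `mean_activation_difference`, `top_k_mask`, and the parameter `k` are hypothetical, and the toy numbers are made up for illustration.

```python
def mean_activation_difference(pos, neg):
    """Average element-wise difference (positive minus negative) over all pairs."""
    n_pairs = len(pos)
    dim = len(pos[0])
    return [
        sum(pos[i][j] - neg[i][j] for i in range(n_pairs)) / n_pairs
        for j in range(dim)
    ]


def top_k_mask(diff, k):
    """Binary mask marking the k elements with the largest |mean difference|.

    Assumption: importance is ranked by absolute mean difference.
    """
    ranked = sorted(range(len(diff)), key=lambda j: abs(diff[j]), reverse=True)
    keep = set(ranked[:k])
    return [1.0 if j in keep else 0.0 for j in range(len(diff))]


# Toy activations: 2 contrastive pairs, 4 elements each (illustrative values).
pos = [[1.0, 0.2, 3.0, 0.5], [1.2, 0.1, 2.0, 0.4]]
neg = [[0.9, 0.2, 0.0, 0.6], [1.1, 0.9, 0.5, 0.4]]

diff = mean_activation_difference(pos, neg)
mask = top_k_mask(diff, k=2)
```

Here elements 1 and 2 differ most between the positive and negative examples, so only those two are marked for intervention.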

At inference, scale the identified components element-wise by the input's own activations → the intervention direction adapts to the input's semantics.
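A rough sketch of the inference-time step, assuming the update takes the form a' = a + delta * (a ⊙ mask), where `mask` is a binary mask over the important elements and `delta` is a strength hyperparameter. The exact form, the function name `apply_scaling`, and the values below are assumptions for illustration only.

```python
def apply_scaling(activation, mask, delta):
    """Return a + delta * (a * mask), computed element-wise.

    Masked-out elements (mask == 0) pass through unchanged.
    """
    return [a + delta * a * m for a, m in zip(activation, mask)]


# Hypothetical binary mask: only elements 1 and 2 are intervened on.
mask = [0.0, 1.0, 1.0, 0.0]

# Two different inputs produce two different shifts under the same mask.
x1 = [2.0, 1.0, -1.0, 0.5]
x2 = [2.0, -3.0, 0.5, 0.5]

steered1 = apply_scaling(x1, mask, delta=0.5)
steered2 = apply_scaling(x2, mask, delta=0.5)
```

Because the shift is proportional to each input's own activation, `x1` and `x2` are pushed in different directions on the masked elements, which is what makes the steering vector dynamic rather than fixed.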

Simply element-wise masking with contrastive dataset like