SEA

Creator

Seonglae Cho

Created

2024 Oct 26 16:4

Editor

Seonglae Cho

Edited

2025 Aug 10 21:41

Refs

Spectral Editing of Activations

Proposes an inference-time method that edits internal LLM activations to improve factuality and reduce bias

Preserves directions with high covariance with positive attributes

Removes directions with high covariance with negative attributes

The edit is computed using SVD-based spectral decomposition and applied during inference by projecting and restoring activations at the last few Transformer layers.
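The keep/remove idea above can be sketched with NumPy. This is a minimal illustration under stated assumptions, not the paper's implementation: the function names, the choice of `k_keep` and `k_remove`, applying the two projections one after the other, and reading "restoring" as rescaling to the original norm are all assumptions made for this example. Activations are taken to be arrays of shape `(n_samples, d)` collected from one layer.

```python
import numpy as np

def cross_covariance(a, b):
    """Cross-covariance of two activation sets, each of shape (n_samples, d)."""
    a_c = a - a.mean(axis=0, keepdims=True)
    b_c = b - b.mean(axis=0, keepdims=True)
    return a_c.T @ b_c / (a.shape[0] - 1)

def top_directions(neutral, demo, k):
    """Top-k left singular vectors of the cross-covariance, shape (d, k)."""
    u, _, _ = np.linalg.svd(cross_covariance(neutral, demo), full_matrices=False)
    return u[:, :k]

def build_projections(neutral, positive, negative, k_keep, k_remove):
    """Build one projector that keeps and one that removes directions."""
    d = neutral.shape[1]
    u_pos = top_directions(neutral, positive, k_keep)
    u_neg = top_directions(neutral, negative, k_remove)
    # Keep directions that covary most with the positive demonstrations.
    keep_pos = u_pos @ u_pos.T
    # Remove directions that covary most with the negative demonstrations.
    drop_neg = np.eye(d) - u_neg @ u_neg.T
    return keep_pos, drop_neg

def edit_activation(h, keep_pos, drop_neg):
    """Project a single activation vector, then rescale to its original norm.

    Assumption: sequential projection and norm rescaling are illustrative
    choices, not taken from the paper.
    """
    edited = drop_neg @ (keep_pos @ h)
    norm = np.linalg.norm(edited)
    if norm > 0:
        edited = edited * (np.linalg.norm(h) / norm)
    return edited
```

In this sketch the two projection matrices are computed once from demonstration activations and then reused for every token at inference time, so the per-token cost is two matrix-vector products per edited layer.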

https://arxiv.org/pdf/2405.09719
