SEA

Creator
Seonglae Cho
Created
2024 Oct 26 16:4
Edited
2025 Aug 10 21:41

Spectral Editing of Activations

Proposes a method that edits the internal activations of an LLM to improve factuality and reduce bias:
  • Preserves directions with high covariance with positive attributes
  • Removes directions with high covariance with negative attributes
The edit is computed with an SVD-based spectral decomposition of these covariances and applied at inference time by projecting and restoring activations in the last few Transformer layers.
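The idea can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names, the choice of `k_keep` and `k_remove`, the order in which the two projections are applied, and the norm rescaling used here as the "restoring" step are all assumptions made for illustration. It takes the top left singular vectors of the cross-covariance between neutral activations and positive (or negative) activations as the directions to keep (or remove).

```python
import numpy as np

def cross_covariance(a, b):
    """Cross-covariance (d x d) between two activation sets of shape (n, d)."""
    a_c = a - a.mean(axis=0, keepdims=True)
    b_c = b - b.mean(axis=0, keepdims=True)
    return a_c.T @ b_c / a.shape[0]

def top_directions(neutral, attributed, k):
    """Top-k left singular vectors of the cross-covariance, shape (d, k)."""
    u, _, _ = np.linalg.svd(cross_covariance(neutral, attributed))
    return u[:, :k]

def build_projections(neutral, positive, negative, k_keep, k_remove):
    """Return (keep, remove) projection matrices, each of shape (d, d).

    keep   projects onto directions that covary most with positive activations.
    remove projects out directions that covary most with negative activations.
    """
    d = neutral.shape[1]
    u_pos = top_directions(neutral, positive, k_keep)
    u_neg = top_directions(neutral, negative, k_remove)
    keep = u_pos @ u_pos.T
    remove = np.eye(d) - u_neg @ u_neg.T
    return keep, remove

def edit_activation(h, keep, remove, eps=1e-12):
    """Edit activations h of shape (..., d) and rescale to the original norm.

    Both matrices are symmetric, so right-multiplying row vectors applies them.
    The rescaling is an assumed stand-in for the "restoring" step.
    """
    edited = (h @ remove) @ keep
    norm_in = np.linalg.norm(h, axis=-1, keepdims=True)
    norm_out = np.linalg.norm(edited, axis=-1, keepdims=True)
    return edited * norm_in / np.maximum(norm_out, eps)
```

In use, `build_projections` would be run once offline on collected activations, and `edit_activation` would be applied to the hidden states of the chosen layers during generation, for example through a forward hook in the model framework.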
 
 
 
 
 
 
 
