The activation vector reconstructed by a sparse autoencoder (SAE) causes a much larger error in next-token prediction than a random vector at the same distance from the original activation. In other words, the reconstruction error has a systematic, abnormally harmful effect on model performance, which distinguishes it from simple noise or random error.
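The comparison described above can be sketched as follows. This is a minimal illustration under stated assumptions, not a reference implementation: `W_U` is a hypothetical toy linear map standing in for the rest of the model, `x_hat` is a synthetic stand-in for a real SAE reconstruction, and the helper names are invented for this example. Because the stand-in reconstruction here is itself random noise, this toy is not expected to show a pathological gap; with a real model and a real SAE, the claim is that `loss_sae` would be noticeably larger than `loss_rand`.

```python
import numpy as np

def random_vector_at_same_distance(x, x_hat, rng):
    """Return x + eps, where eps has a random direction and norm ||x_hat - x||."""
    eps_norm = np.linalg.norm(x_hat - x)
    direction = rng.standard_normal(x.shape)
    direction /= np.linalg.norm(direction)
    return x + eps_norm * direction

def cross_entropy(logits, target):
    """Next-token cross-entropy loss for a single position."""
    logits = logits - logits.max()  # shift for numerical stability
    log_probs = logits - np.log(np.exp(logits).sum())
    return -log_probs[target]

rng = np.random.default_rng(0)
d_model, vocab = 64, 100

# Assumption: a toy linear map replaces the downstream layers of a real model.
W_U = rng.standard_normal((d_model, vocab)) / np.sqrt(d_model)

x = rng.standard_normal(d_model)                # original activation
x_hat = x + 0.3 * rng.standard_normal(d_model)  # stand-in for an SAE reconstruction
target = int(np.argmax(x @ W_U))                # token the clean activation predicts

# Baseline: a random perturbation with the same L2 distance from x as x_hat.
x_rand = random_vector_at_same_distance(x, x_hat, rng)

loss_clean = cross_entropy(x @ W_U, target)
loss_sae = cross_entropy(x_hat @ W_U, target)
loss_rand = cross_entropy(x_rand @ W_U, target)

print("clean loss:             ", loss_clean)
print("loss increase (SAE):    ", loss_sae - loss_clean)
print("loss increase (random): ", loss_rand - loss_clean)
```

Matching the L2 norm of the perturbation is what makes the baseline fair: any difference between the two loss increases then comes from the direction of the error, not its size.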
SAE pathological error
Created: 2024 Nov 19 22:34
Edited: 2025 Jan 8 20:49