UMAP
Browsing code error feature
Layer-wise visualization analysis
- SAE reconstruction performance degrades sharply once the input exceeds the training context length
- In short contexts, performance worsens in later layers, while in long contexts, early layers show degraded performance
- Steering with most SAE features hurts model performance, but steering with some features improves it
- Errors in early layer SAEs negatively affect the performance of later layers
Cosine similarity
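One way to check the context-length finding is to score reconstruction quality per token position and compare positions inside and beyond the training context length. The sketch below is only an illustration under assumptions: the notes do not say which quantities the cosine similarity is taken between, so it is assumed here to be between the original activations and the SAE reconstructions at each position. The function names `per_position_metrics` and `split_by_context`, the normalized MSE metric, and the synthetic data are all hypothetical and not taken from the notes.

```python
import numpy as np


def per_position_metrics(acts, recon, eps=1e-8):
    """Return (cosine similarity, normalized MSE) for each token position.

    acts, recon: arrays of shape (seq_len, d_model) holding the original
    activations and the SAE reconstructions (assumed layout).
    """
    acts = np.asarray(acts, dtype=np.float64)
    recon = np.asarray(recon, dtype=np.float64)

    # Cosine similarity between each activation vector and its reconstruction.
    dot = np.sum(acts * recon, axis=-1)
    norms = np.linalg.norm(acts, axis=-1) * np.linalg.norm(recon, axis=-1)
    cos = dot / np.maximum(norms, eps)

    # Squared reconstruction error relative to the activation's squared norm.
    err = np.sum((acts - recon) ** 2, axis=-1)
    nmse = err / np.maximum(np.sum(acts ** 2, axis=-1), eps)
    return cos, nmse


def split_by_context(values, train_ctx_len):
    """Mean of a per-position metric inside and beyond the training context."""
    values = np.asarray(values, dtype=np.float64)
    inside = values[:train_ctx_len]
    beyond = values[train_ctx_len:]
    beyond_mean = float(beyond.mean()) if beyond.size else float("nan")
    return float(inside.mean()), beyond_mean


# Synthetic stand-in data: reconstruction noise grows past position 4.
rng = np.random.default_rng(0)
demo_acts = rng.normal(size=(8, 16))
noise_scale = np.where(np.arange(8) < 4, 0.1, 1.0)[:, None]
demo_recon = demo_acts + noise_scale * rng.normal(size=(8, 16))

demo_cos, demo_nmse = per_position_metrics(demo_acts, demo_recon)
print("cos sim (inside, beyond):", split_by_context(demo_cos, train_ctx_len=4))
print("nMSE    (inside, beyond):", split_by_context(demo_nmse, train_ctx_len=4))
```

Running the same comparison separately for each layer's SAE would give the layer-wise view described in the list above. In practice `acts` and `recon` would come from the model and the SAE rather than from random data.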