SAE Dead Feature

Creator
Seonglae Cho
Created
2024 Oct 24 0:16
Editor
Seonglae Cho
Edited
2025 Apr 6 19:0
Refs
Lottery ticket hypothesis
  • Neuron resampling
  • Ghost Gradient
  • Auxiliary-K loss
  • JumpReLU SAE
    pre-act loss with Straight-through estimator
  • SAE weight initialization
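
Several of the techniques listed above (neuron resampling, ghost gradients, the Auxiliary-K loss) first need to decide which latents currently count as dead. A minimal sketch of that bookkeeping follows, assuming a latent is flagged once it has not fired for a fixed number of training steps. The class name `DeadLatentTracker` and the parameter `dead_after_steps` are hypothetical, chosen for illustration, and the threshold itself is an assumption rather than a value taken from any of the referenced methods.

```python
class DeadLatentTracker:
    """Counts, per latent, how many training steps have passed since it last fired."""

    def __init__(self, n_latents, dead_after_steps):
        # dead_after_steps is an assumed, tunable threshold (hypothetical name).
        self.steps_since_fired = [0] * n_latents
        self.dead_after_steps = dead_after_steps

    def update(self, batch_acts):
        """batch_acts: list of rows (one per token), each with n_latents activations."""
        fired = [False] * len(self.steps_since_fired)
        for row in batch_acts:
            for j, a in enumerate(row):
                if a > 0.0:
                    fired[j] = True
        # Reset the counter for latents that fired, increment it for the rest.
        for j, did_fire in enumerate(fired):
            self.steps_since_fired[j] = 0 if did_fire else self.steps_since_fired[j] + 1

    def dead_latents(self):
        """Indices of latents that have been silent for at least dead_after_steps."""
        return [j for j, s in enumerate(self.steps_since_fired)
                if s >= self.dead_after_steps]


tracker = DeadLatentTracker(n_latents=3, dead_after_steps=2)
tracker.update([[0.5, 0.0, 0.0], [0.0, 0.0, 0.0]])  # only latent 0 fires
tracker.update([[0.1, 0.2, 0.0]])                   # latents 0 and 1 fire
print(tracker.dead_latents())  # [2]
```

The flagged indices are what a mitigation step would then act on, for example by reinitializing those latents or by routing an auxiliary loss through them.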

Factors

  • Increasing the dictionary size increases the number of dead neurons
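
To compare dictionaries of different sizes on this factor, one would need a common metric. The sketch below computes the fraction of dead features from a matrix of recorded SAE activations, assuming "dead" means the feature never exceeds a threshold on any sample in the evaluation set. The function name `dead_feature_fraction`, the `eps` parameter, and the toy activations are hypothetical, not measurements.

```python
def dead_feature_fraction(acts, eps=0.0):
    """Fraction of features that never fire above eps.

    acts: list of rows (one per token), each holding one activation per feature.
    """
    n_features = len(acts[0])
    fire_counts = [0] * n_features
    for row in acts:
        for j, a in enumerate(row):
            if a > eps:
                fire_counts[j] += 1
    n_dead = sum(1 for c in fire_counts if c == 0)
    return n_dead / n_features


# Toy activations (illustrative only): features 2 and 3 never fire.
acts = [
    [0.0, 1.2, 0.0, 0.0],
    [0.3, 0.0, 0.0, 0.0],
]
print(dead_feature_fraction(acts))  # 0.5
```

The result depends on how many tokens are evaluated: a rarely firing feature can look dead on a small sample, so the sample size should be held fixed when comparing dictionary sizes.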
 
 
 
Open Source Sparse Autoencoders for all Residual Stream Layers of GPT2-Small — LessWrong
https://www.lesswrong.com/posts/f9EgfLSurAiqRJySD/open-source-sparse-autoencoders-for-all-residual-stream

Absent feature

arxiv.org
https://arxiv.org/pdf/2410.14670
 
 

