
Meta SAE

Creator: Seonglae Cho
Created: 2025 Jan 28 13:49
Editor: Seonglae Cho
Edited: 2025 Mar 8 12:28
Refs: Neuron SAE

Meta SAE features on SAE decoder

MetaSAEs are sparse autoencoders (SAEs) trained on the decoder directions of another SAE rather than on model activations: each latent's decoder direction in the base SAE is one training sample.
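
As a rough illustration of this setup, the sketch below runs one forward pass of a TopK-style meta-SAE over a stand-in decoder matrix. Everything here is an assumption for illustration: the toy sizes, the random matrix standing in for a real base SAE decoder, the TopK-then-ReLU activation, and the helper names `normalize_rows` and `topk_meta_sae_forward` are hypothetical and not taken from the referenced work. The weights are untrained, so the reconstruction error only shows how the objective would be measured.

```python
import numpy as np

def normalize_rows(w):
    """Scale each row to unit L2 norm so only the direction matters."""
    return w / np.linalg.norm(w, axis=1, keepdims=True)

def topk_meta_sae_forward(x, w_enc, b_enc, w_dec, b_dec, k):
    """One forward pass of a TopK SAE whose inputs are decoder directions.

    x has shape (n_latents, d_model): one row per base-SAE latent.
    """
    pre = (x - b_dec) @ w_enc + b_enc  # (n_latents, n_meta)
    # Keep the k largest pre-activations in each row and zero the rest.
    idx = np.argpartition(pre, -k, axis=1)[:, -k:]
    acts = np.zeros_like(pre)
    np.put_along_axis(acts, idx, np.take_along_axis(pre, idx, axis=1), axis=1)
    acts = np.maximum(acts, 0.0)  # drop negative survivors
    recon = acts @ w_dec + b_dec  # (n_latents, d_model)
    return acts, recon

rng = np.random.default_rng(0)
d_model, n_latents, n_meta, k = 16, 64, 8, 4  # toy sizes (hypothetical)

# Stand-in for the base SAE's decoder matrix, one unit-norm row per latent.
base_decoder = normalize_rows(rng.normal(size=(n_latents, d_model)))

# Randomly initialised (untrained) meta-SAE parameters.
w_enc = rng.normal(size=(d_model, n_meta)) * 0.1
w_dec = normalize_rows(rng.normal(size=(n_meta, d_model)))
b_enc = np.zeros(n_meta)
b_dec = np.zeros(d_model)

meta_acts, recon = topk_meta_sae_forward(base_decoder, w_enc, b_enc, w_dec, b_dec, k)
mse = float(np.mean((recon - base_decoder) ** 2))
print(meta_acts.shape, recon.shape)
```

In this sketch each row of `meta_acts` has at most `k` nonzero entries, so every base latent is described by a small set of meta-latents. Training would minimise `mse` over the meta-SAE parameters.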

Meta Latents

https://arxiv.org/pdf/2502.04878

MetaSAE

openreview.net
https://openreview.net/pdf?id=9ca9eHNrdH
Showing SAE Latents Are Not Atomic Using Meta-SAEs (LessWrong)
Bart, Michael and Patrick are joint first authors. Research conducted as part of MATS 6.0 in Lee Sharkey and Neel Nanda’s streams. Thanks to Mckenna…
https://www.lesswrong.com/posts/TMAmHh4DdMr4nCSr5/
Weight
pleask/gpt2-small_blocks.8.hook_resid_pre_2304_topk_4_0.001_2000
https://huggingface.co/pleask/gpt2-small_blocks.8.hook_resid_pre_2304_topk_4_0.001_2000/tree/main
Meta feature explorer (public dashboard)
https://metasae.streamlit.app/?page=Meta+Feature+Explorer&meta_feature=228

Copyright Seonglae Cho