Meta SAE features on SAE decoder
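MetaSAEs are sparse autoencoders (SAEs) trained on the decoder directions of another SAE: each latent's decoder vector in the base SAE is one training example, and the MetaSAE's own latents (meta latents) decompose those vectors into a sparse combination of shared components.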
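
As a rough illustration of that setup, the sketch below builds the training set from a base SAE's decoder matrix and runs one forward pass through an untrained top-k MetaSAE. Everything here is an assumption for illustration: the sizes, the random stand-in for the base decoder, the unit-normalization step, the top-k activation with k = 4 (suggested by the `topk_4` in the checkpoint name linked below, but not confirmed), and all variable and function names. It shows the data flow only and does not train anything.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes, kept small for illustration only.
n_base_latents, d_model = 512, 64   # base SAE: one decoder direction per latent
n_meta_latents, k = 32, 4           # MetaSAE width and top-k sparsity (assumed)

# Random stand-in for the base SAE's decoder matrix; a real one would be loaded.
W_dec_base = rng.normal(size=(n_base_latents, d_model))

# MetaSAE training set: each base decoder direction is one example.
# Unit-normalizing the rows is an assumption of this sketch.
X = W_dec_base / np.linalg.norm(W_dec_base, axis=1, keepdims=True)

# Untrained MetaSAE parameters (hypothetical initialization).
W_enc = rng.normal(scale=0.1, size=(d_model, n_meta_latents))
b_enc = np.zeros(n_meta_latents)
W_dec = rng.normal(scale=0.1, size=(n_meta_latents, d_model))
b_dec = np.zeros(d_model)


def topk_activation(pre, k):
    """Apply ReLU, then keep the k largest entries per row and zero the rest."""
    acts = np.maximum(pre, 0.0)
    idx = np.argpartition(acts, -k, axis=1)[:, -k:]  # indices of the k largest per row
    mask = np.zeros_like(acts, dtype=bool)
    np.put_along_axis(mask, idx, True, axis=1)
    return np.where(mask, acts, 0.0)


def meta_sae_forward(x):
    """Encode decoder directions into meta latents and reconstruct them."""
    meta_acts = topk_activation((x - b_dec) @ W_enc + b_enc, k)
    recon = meta_acts @ W_dec + b_dec
    return meta_acts, recon


meta_acts, recon = meta_sae_forward(X)
mse = float(np.mean((recon - X) ** 2))  # reconstruction error a training loop would minimize
print(X.shape, meta_acts.shape, recon.shape)
```

Each row of `meta_acts` has at most `k` nonzero entries, so every base latent is described by a handful of meta latents. A training loop would minimize `mse` over `W_enc`, `b_enc`, `W_dec`, and `b_dec`.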

Meta Latents

MetaSAE
openreview.net
https://openreview.net/pdf?id=9ca9eHNrdH
Showing SAE Latents Are Not Atomic Using Meta-SAEs — LessWrong
Bart, Michael and Patrick are joint first authors. Research conducted as part of MATS 6.0 in Lee Sharkey and Neel Nanda’s streams. Thanks to Mckenna…
https://www.lesswrong.com/posts/TMAmHh4DdMr4nCSr5/
weights
pleask/gpt2-small_blocks.8.hook_resid_pre_2304_topk_4_0.001_2000 at main
https://huggingface.co/pleask/gpt2-small_blocks.8.hook_resid_pre_2304_topk_4_0.001_2000/tree/main
meta feature explorer
public dashboard
https://metasae.streamlit.app/?page=Meta+Feature+Explorer&meta_feature=228


Seonglae Cho