Analyzing Sparse Autoencoders (SAEs) from Gemma Scope — pyvene 0.1.2 documentation
This tutorial aims to (1) reproduce and (2) extend some of the results from the Gemma Scope SAE tutorial notebook on interpreting SAE latents. It also demonstrates basic model steering with SAEs. The notebook is built as a showcase for the Gemma 2 2B model and its SAEs, but the same workflow can be extended to other models and their SAEs.
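Gemma Scope SAEs use a JumpReLU activation: a latent fires only when its pre-activation exceeds a learned per-latent threshold, and reconstruction is a linear decode of the resulting sparse activations. Steering then amounts to adding a scaled decoder direction back into the residual stream. The sketch below illustrates this with toy random weights; the dimensions, parameter names, and steering coefficient are illustrative stand-ins, not the pretrained weights or the pyvene API.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy dimensions; real Gemma Scope SAEs for Gemma 2 2B use much larger
# widths (e.g. d_model=2304 and tens of thousands of latents).
d_model, d_sae = 8, 32

# Randomly initialized parameters standing in for pretrained SAE weights.
W_enc = rng.normal(size=(d_model, d_sae))
b_enc = rng.normal(size=d_sae)
W_dec = rng.normal(size=(d_sae, d_model))
b_dec = rng.normal(size=d_model)
threshold = np.full(d_sae, 1.0)  # learned per-latent threshold in the real SAE

def encode(x):
    """JumpReLU encoder: keep pre-activations above the threshold, zero the rest."""
    pre = x @ W_enc + b_enc
    return np.where(pre > threshold, pre, 0.0)

def decode(acts):
    """Linear decoder: reconstruct the input from sparse latent activations."""
    return acts @ W_dec + b_dec

x = rng.normal(size=d_model)   # stand-in for a residual-stream activation
acts = encode(x)               # sparse latent activations, shape (d_sae,)
x_hat = decode(acts)           # reconstruction, shape (d_model,)

# Basic steering: add a scaled decoder direction for one latent (index and
# coefficient are arbitrary here) back into the activation.
steered = x + 4.0 * W_dec[3]
```

In the actual tutorial, `x` would be a hidden state collected from Gemma 2 2B and the weights would be loaded from the released Gemma Scope checkpoints rather than sampled randomly.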
https://stanfordnlp.github.io/pyvene/tutorials/basic_tutorials/Sparse_Autoencoder.html