Llama SAE

Creator

Seonglae Cho

Created

2025 Jan 26 19:1

Editor

Seonglae Cho

Edited

2025 Jun 2 10:19

Refs

fnlp/Llama-Scope · Hugging Face

https://huggingface.co/fnlp/Llama-Scope

Llama 8B, single layer

few layers

andyrdt/saes-llama-3.1-8b-instruct at main

https://huggingface.co/andyrdt/saes-llama-3.1-8b-instruct/tree/main

Approximately 30% and 3.5% of the features in the Llama 3.1 8B and Llama 3.3 70B SAEs, respectively, were removed because the SAE analysis identified them as harmful.
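The note does not say how the features were removed. As a minimal sketch, assuming a standard ReLU sparse autoencoder where each feature owns one encoder row and one decoder row, removal can be pictured as dropping the flagged rows. Every name, weight, and size below is hypothetical and is not taken from any of the released SAEs.

```python
# Toy sparse autoencoder (SAE) in pure Python.
# All weights, sizes, and function names are hypothetical illustrations.

def relu(value):
    return value if value > 0.0 else 0.0

def encode(x, w_enc, b_enc):
    """Feature activations: one ReLU unit per SAE feature."""
    return [relu(sum(w * xi for w, xi in zip(row, x)) + b)
            for row, b in zip(w_enc, b_enc)]

def decode(acts, w_dec, b_dec):
    """Reconstruction: bias plus activation-weighted sum of decoder rows."""
    out = list(b_dec)
    for act, direction in zip(acts, w_dec):
        for i, d in enumerate(direction):
            out[i] += act * d
    return out

def remove_features(w_enc, b_enc, w_dec, flagged):
    """Return copies of the SAE parameters without the flagged feature indices."""
    keep = [i for i in range(len(w_enc)) if i not in flagged]
    return ([w_enc[i] for i in keep],
            [b_enc[i] for i in keep],
            [w_dec[i] for i in keep])

# Three features over a 2-dimensional activation; feature 1 is flagged.
w_enc = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
b_enc = [0.0, 0.0, -1.0]
w_dec = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
b_dec = [0.0, 0.0]

w_enc2, b_enc2, w_dec2 = remove_features(w_enc, b_enc, w_dec, flagged={1})

x = [2.0, 3.0]
acts = encode(x, w_enc2, b_enc2)       # activations of the two kept features
recon = decode(acts, w_dec2, b_dec)    # reconstruction without feature 1
removed_fraction = 1 - len(w_enc2) / len(w_enc)
```

In this toy case one of three features is dropped, so the reconstruction no longer uses the direction that feature contributed. The same bookkeeping would apply to a real SAE with many thousands of features.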

Llama 70B

Mapping the Latent Space of Llama 3.3 70B

We have trained sparse autoencoders (SAEs) on Llama 3.3 70B and released the interpreted model for general access via an API.

https://www.goodfire.ai/papers/mapping-latent-spaces-llama

Goodfire/Llama-3.3-70B-Instruct-SAE-l50 · Hugging Face

https://huggingface.co/Goodfire/Llama-3.3-70B-Instruct-SAE-l50

Llama 8B, every layer

Understanding and Steering Llama 3 with Sparse Autoencoders

We present a novel approach to interpreting and controlling large language model behavior with sparse autoencoders, demonstrated through a desktop interface for Llama-3-8B.

https://www.goodfire.ai/papers/understanding-and-steering-llama-3

Goodfire/Llama-3.1-8B-Instruct-SAE-l19 · Hugging Face

https://huggingface.co/Goodfire/Llama-3.1-8B-Instruct-SAE-l19
