Llama for single layer 8B
few layers
Removed approximately 30% of the features in our Llama 3.1 8B SAE and 3.5% of the features in our Llama 3.3 70B SAE, eliminating features that the SAE analysis determined to be harmful.
Llama 70B
Llama for every layer 8B