Adam Karvonen

Creator: Seonglae Cho
Created: 2025 Feb 27 13:21
Editor: Seonglae Cho
Edited: 2025 Feb 27 13:21
Refs
  • SAEBench
Sieve: SAEs Beat Baselines on a Real-World Task (A Code Generation Case Study) | Tilde
Our methods achieve Pareto dominance across the axes of task success rate, task constraint satisfaction, and general model performance.
https://www.tilderesearch.com/blog/sieve
An Intuitive Explanation of Sparse Autoencoders for LLM Interpretability
Sparse Autoencoders (SAEs) have recently become popular for interpretability of machine learning models (although sparse dictionary learning has been around since 1997). Machine learning models and LLMs are becoming more powerful and useful, but they are still black boxes, and we don’t understand how they do the things that they are capable of. It seems like it would be useful if we could understand how they work.
https://adamkarvonen.github.io/machine_learning/2024/06/11/sae-intuitions.html
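
The linked post explains how sparse autoencoders decompose an LLM's internal activations into a wider set of sparse, more interpretable features. As a rough illustration of that technique (not the post's own code), here is a minimal SAE sketch in PyTorch: a ReLU encoder into a wider feature space, a linear decoder back, and a reconstruction loss with an L1 sparsity penalty. The dimensions and the l1_coeff value are illustrative assumptions, not values from the post.

```python
# Minimal sparse-autoencoder sketch for LLM interpretability.
# Assumes PyTorch; sizes and the L1 coefficient are illustrative.
import torch
import torch.nn as nn

class SparseAutoencoder(nn.Module):
    def __init__(self, d_model: int, d_hidden: int):
        super().__init__()
        self.encoder = nn.Linear(d_model, d_hidden)
        self.decoder = nn.Linear(d_hidden, d_model)

    def forward(self, x: torch.Tensor):
        # Encode activations into a wider, non-negative feature space.
        features = torch.relu(self.encoder(x))
        # Reconstruct the original activations from those features.
        return self.decoder(features), features

def sae_loss(x, recon, features, l1_coeff: float = 1e-3):
    # Reconstruction error plus an L1 penalty: the penalty pushes most
    # feature activations to zero, so each input uses only a few of them.
    mse = (recon - x).pow(2).mean()
    sparsity = features.abs().sum(dim=-1).mean()
    return mse + l1_coeff * sparsity

# Stand-in usage: x would normally be a batch of activations harvested
# from one layer of an LLM, with d_hidden several times d_model.
sae = SparseAutoencoder(d_model=768, d_hidden=768 * 8)
x = torch.randn(32, 768)
recon, feats = sae(x)
loss = sae_loss(x, recon, feats)
loss.backward()
```

After training, each hidden feature can be inspected by finding the inputs that activate it most strongly, which is what makes the sparse basis more interpretable than raw neurons.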
Copyright Seonglae Cho