Exemplar Partitioning
exemplar-partitioning
jessicarumbelow • Updated 2026 May 25 21:13
Exemplar Partitioning is a method that divides a language model’s activation space into Voronoi cells to uncover interpretable structure.
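
To make the idea of a Voronoi partition over activations concrete, here is a minimal sketch. It assumes the cells are defined by a set of exemplar vectors and that each activation is assigned to its nearest exemplar under Euclidean distance. The function name `assign_voronoi_cells`, the synthetic data, and the way exemplars are picked are all illustrative assumptions, not the procedure from the linked post.

```python
import numpy as np

def assign_voronoi_cells(activations, exemplars):
    """Return, for each activation, the index of its nearest exemplar.

    Hypothetical helper: nearest-exemplar assignment under Euclidean
    distance is exactly membership in that exemplar's Voronoi cell.
    """
    # Pairwise squared distances, shape (n_activations, n_exemplars).
    diffs = activations[:, None, :] - exemplars[None, :, :]
    sq_dists = np.sum(diffs ** 2, axis=-1)
    return np.argmin(sq_dists, axis=1)

# Synthetic stand-in for model activations: three well-separated
# clusters in an 8-dimensional space (not real model data).
rng = np.random.default_rng(0)
dim, per_cluster = 8, 20
centers = 10.0 * np.eye(dim)[:3]
labels = np.repeat(np.arange(3), per_cluster)
activations = centers[labels] + 0.1 * rng.standard_normal((labels.size, dim))

# Assumption for illustration: use the first point of each cluster
# as that cluster's exemplar.
exemplars = activations[[0, per_cluster, 2 * per_cluster]]

cells = assign_voronoi_cells(activations, exemplars)
```

In this toy setup every activation falls into the cell of the exemplar drawn from its own cluster. With real activations, the interpretive work would be in inspecting which inputs land in each cell.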

An Introduction to Exemplar Partitioning for Mechanistic Interpretability — LessWrong
Voronoi partitions on activations reveal interpretable structure with orders of magnitude less compute than SAEs.
https://www.lesswrong.com/posts/RroeHBSkBXXDsrryq/an-introduction-to-exemplar-partitioning-for-mechanistic-1

Seonglae Cho