NDM

Creator
Seonglae Cho
Created
2025 Aug 5 17:48
Edited
2025 Aug 22 22:43

Neighbor Distance Minimization

Unlike Neuron SAE, which looks for individual monosemantic features, NDM is designed to capture subspaces in which sets of mutually exclusive features (i.e. the values of a variable) cluster. In other words, it assumes that the representation space contains subspaces made up of similar features, and it learns them by dividing the dimensions into a predetermined partition c and minimizing the sum of nearest-neighbor distances within each subspace.
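As a rough formalization of the objective described above (a sketch, not necessarily the exact loss used by the method): assume n activation vectors x_i, a partition of the dimensions into S groups, and a hypothetical operator P_s that projects a vector onto the dimensions of group s. These symbols are introduced here for illustration only.

```latex
\mathcal{L} \;=\; \sum_{s=1}^{S} \sum_{i=1}^{n} \min_{j \neq i} \bigl\lVert P_s x_i - P_s x_j \bigr\rVert_2
```

The inner minimum is the distance from point i to its nearest neighbor inside subspace s, so the loss is small when every subspace holds tightly clustered projections.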
Because it is not looking for monosemantic features, it expects each projection to need fewer dimensions. As the nearest-neighbor distance within a subspace decreases, so does the entropy of the projected points, and the dimension partition is learned to reduce it. Clustering is achieved through kNN distance minimization.
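To make the nearest-neighbor objective concrete, here is a minimal NumPy sketch that only evaluates such a loss and does not train anything. The function name `nn_distance_loss`, the orthogonal matrix `R`, and the `partition` list of subspace sizes are all hypothetical; in particular, applying a rotation before splitting the dimensions is an assumption of this sketch, not something stated above. The toy data has two variables with three mutually exclusive values each, stored in separate 2-D subspaces.

```python
import numpy as np

def nn_distance_loss(X, R, partition):
    """Sum of nearest-neighbor distances inside each subspace.

    X: (n, d) activations; R: (d, d) orthogonal matrix (assumed);
    partition: subspace sizes that add up to d (hypothetical format).
    """
    Z = X @ R  # express activations in the candidate basis
    total, start = 0.0, 0
    for size in partition:
        S = Z[:, start:start + size]           # projection onto one subspace
        diff = S[:, None, :] - S[None, :, :]   # all pairwise differences
        D = np.sqrt((diff ** 2).sum(axis=-1))  # pairwise Euclidean distances
        np.fill_diagonal(D, np.inf)            # a point is not its own neighbor
        total += D.min(axis=1).sum()           # nearest-neighbor distance per point
        start += size
    return total

# Toy data: variable A lives in dims 0-1, variable B in dims 2-3,
# and every combination of their values appears exactly once.
A = np.array([[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]])
B = np.array([[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]])
X = np.array([np.concatenate([a, b]) for a in A for b in B])

# Basis aligned with the two variables.
R_aligned = np.eye(4)

# Basis that mixes dims 0 and 2 by a 45-degree rotation.
c, s = np.cos(np.pi / 4), np.sin(np.pi / 4)
R_mixed = np.eye(4)
R_mixed[0, 0], R_mixed[0, 2] = c, -s
R_mixed[2, 0], R_mixed[2, 2] = s, c

loss_aligned = nn_distance_loss(X, R_aligned, [2, 2])
loss_mixed = nn_distance_loss(X, R_mixed, [2, 2])
print(loss_aligned, loss_mixed)
```

In the aligned basis each subspace sees only three distinct points, each repeated, so every nearest-neighbor distance collapses to zero. The mixed basis spreads the same data over nine distinct points per subspace and the loss goes up, which is the signal a learned partition would follow.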
 
 
 
 
 
