
AI Feature Composition

Creator
Seonglae Cho
Created
2025 Feb 1 22:50
Editor
Seonglae Cho
Edited
2025 Feb 1 23:0
Refs
Superposition Hypothesis
Lower layers use composition to build low-level features, deeper layers exploit superposition by packing polysemantic features into the limited bandwidth of the residual stream, and output layers recompose that information so that a linear head can read it out.
Although this is a widely known insight, it is difficult to find clear proof of it or the original paper that first proposed it.
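
The difference between the two strategies can be illustrated with a toy example. This is only a minimal sketch under stated assumptions, not the method of any cited paper: composition is modeled as concatenated one-hot slots (one slot per attribute), and superposition as more features than dimensions, each assigned a random unit direction. All function names (`compose`, `random_unit_directions`, `superpose`, `linear_readout`) and the sizes used are hypothetical choices for illustration.

```python
import math
import random


def compose(shape_idx, color_idx, n_shapes=3, n_colors=3):
    """Composition (toy assumption): each attribute gets its own one-hot slot."""
    vec = [0.0] * (n_shapes + n_colors)
    vec[shape_idx] = 1.0
    vec[n_shapes + color_idx] = 1.0
    return vec


def random_unit_directions(n_features, dim, seed=0):
    """Superposition (toy assumption): one random unit direction per feature,
    with n_features allowed to exceed dim."""
    rng = random.Random(seed)
    directions = []
    for _ in range(n_features):
        v = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        norm = math.sqrt(sum(x * x for x in v))
        directions.append([x / norm for x in v])
    return directions


def superpose(active, directions):
    """Sum the directions of the active features into one shared vector."""
    dim = len(directions[0])
    out = [0.0] * dim
    for i in active:
        for k in range(dim):
            out[k] += directions[i][k]
    return out


def linear_readout(vec, directions):
    """Score every feature by a dot product, as a linear head would."""
    return [sum(a * b for a, b in zip(vec, d)) for d in directions]


if __name__ == "__main__":
    # Composition: 6 dimensions cover all 3 x 3 shape/color combinations.
    print(compose(1, 2))

    # Superposition: 256 features share 64 dimensions.
    directions = random_unit_directions(n_features=256, dim=64)
    scores = linear_readout(superpose([3], directions), directions)
    interference = max(abs(s) for i, s in enumerate(scores) if i != 3)
    print(round(scores[3], 6), interference)
```

In this sketch the composed vector is read out exactly, whereas the superposed vector gives the active feature a score near 1 and every other feature a small nonzero score. That interference is the cost of packing more features than dimensions, and it stays manageable only while few features are active at once.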
 
Distributed Representations: Composition & Superposition
Distributed representations are a classic idea in both neuroscience and connectionist approaches to AI. We're often asked how our work on superposition relates to it. Since publishing our original paper on superposition, we've had more time to reflect on the relationship between the topics and discuss it with people, and wanted to expand on our earlier discussion in the related work section and share a few thoughts. (We care a lot about superposition and the structure of distributed representations because decomposing representations into independent components is necessary to escape the curse of dimensionality and understand neural networks.)
https://transformer-circuits.pub/2023/superposition-composition/index.html
 

Copyright Seonglae Cho