
Universality Hypothesis

Creator: Seonglae Cho
Created: 2024 Apr 6 13:13
Editor: Seonglae Cho
Edited: 2025 May 10 16:40
Refs
Natural Abstraction Hypothesis
Correlation
AI Feature
SAE Transferability

Different models learn similar features and circuits

https://transformer-circuits.pub/2023/monosemantic-features#phenomenology-universality

Convergent learning (2016)

https://arxiv.org/pdf/1511.07543.pdf

A Toy Model of Universality (2023)

https://arxiv.org/pdf/2302.03025.pdf
Connectome
Computational Neuroscience
https://arxiv.org/pdf/2211.12935
https://arxiv.org/pdf/2210.06756



Copyright Seonglae Cho