Universality Hypothesis

Created
Created
2024 Apr 6 13:13
Editor
Creator
Creator
Seonglae ChoSeonglae Cho
Edited
Edited
2025 Dec 16 13:10

Different models learn similar features and circuits

https://transformer-circuits.pub/2023/monosemantic-features#phenomenology-universality
Universality Hypothesis Approches
 
 
Universality Types
 
 
 

Convergent learning (2016)

Grandmother cell,

2005 Nature study showed that single neurons in the human medial temporal lobe (MTL) respond selectively to the same person/object across different photo angles, lighting, and contexts with invariant selective responses. These results suggest an invariant, sparse and explicit code, which might be important in the transformation of complex visual percepts into long-term and more abstract memories.
Trump always has dedicated neuron
Mechanistic Interpretability explained | Chris Olah and Lex Fridman
Lex Fridman Podcast full episode: https://www.youtube.com/watch?v=ugvHCXCOmm4 Thank you for listening ❤ Check out our sponsors: https://lexfridman.com/sponsors/cv8247-sb See below for guest bio, links, and to give feedback, submit questions, contact Lex, etc. *GUEST BIO:* Dario Amodei is the CEO of Anthropic, the company that created Claude. Amanda Askell is an AI researcher working on Claude's character and personality. Chris Olah is an AI researcher working on mechanistic interpretability. *CONTACT LEX:* *Feedback* - give feedback to Lex: https://lexfridman.com/survey *AMA* - submit questions, videos or call-in: https://lexfridman.com/ama *Hiring* - join our team: https://lexfridman.com/hiring *Other* - other ways to get in touch: https://lexfridman.com/contact *EPISODE LINKS:* Claude: https://claude.ai Anthropic's X: https://x.com/AnthropicAI Anthropic's Website: https://anthropic.com Dario's X: https://x.com/DarioAmodei Dario's Website: https://darioamodei.com Machines of Loving Grace (Essay): https://darioamodei.com/machines-of-loving-grace Chris's X: https://x.com/ch402 Chris's Blog: https://colah.github.io Amanda's X: https://x.com/AmandaAskell Amanda's Website: https://askell.io *SPONSORS:* To support this podcast, check out our sponsors & get discounts: *Encord:* AI tooling for annotation & data management. Go to https://lexfridman.com/s/encord-cv8247-sb *Notion:* Note-taking and team collaboration. Go to https://lexfridman.com/s/notion-cv8247-sb *Shopify:* Sell stuff online. Go to https://lexfridman.com/s/shopify-cv8247-sb *BetterHelp:* Online therapy and counseling. Go to https://lexfridman.com/s/betterhelp-cv8247-sb *LMNT:* Zero-sugar electrolyte drink mix. Go to https://lexfridman.com/s/lmnt-cv8247-sb *PODCAST LINKS:* - Podcast Website: https://lexfridman.com/podcast - Apple Podcasts: https://apple.co/2lwqZIr - Spotify: https://spoti.fi/2nEwCF8 - RSS: https://lexfridman.com/feed/podcast/ - Podcast Playlist: https://www.youtube.com/playlist?list=PLrAXtmErZgOdP_8GztsuKi9nrraNbKKp4 - Clips Channel: https://www.youtube.com/lexclips *SOCIAL LINKS:* - X: https://x.com/lexfridman - Instagram: https://instagram.com/lexfridman - TikTok: https://tiktok.com/@lexfridman - LinkedIn: https://linkedin.com/in/lexfridman - Facebook: https://facebook.com/lexfridman - Patreon: https://patreon.com/lexfridman - Telegram: https://t.me/lexfridman - Reddit: https://reddit.com/r/lexfridman
Mechanistic Interpretability explained | Chris Olah and Lex Fridman
 
 

Recommendations