Natural Abstraction Hypothesis (NAH)
The efficient abstractions learned by AI reflect the inherent characteristics of the environment itself
- Abstractability - The physical world can be abstracted: it can be summarized with information of much lower dimension than the full complexity of the system
- Human-Compatibility - These low-dimensional abstractions align with the abstractions humans use
- Convergence - A wide variety of cognitive architectures are likely to use similar abstractions
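The abstractability claim can be illustrated with a toy sketch. This is not Wentworth's formalism. It is a hypothetical example in which a system of many noisy "particles" shares one latent quantity (called `temperature` here), and a far-away observable depends on the system only through that quantity. All names, the noise levels, and the factor `2.0` are assumptions made up for illustration.

```python
import random
import statistics

def simulate_system(n_particles, temperature, rng):
    """Toy high-dimensional state: each particle's energy is the shared
    latent temperature plus independent unit-variance noise."""
    return [temperature + rng.gauss(0.0, 1.0) for _ in range(n_particles)]

def summarize(state):
    """Low-dimensional abstraction: a single number for the whole state."""
    return statistics.fmean(state)

def far_away_observable(temperature, rng):
    """A distant measurement that depends on the system only through
    its temperature (assumed relation: 2.0 * temperature + small noise)."""
    return 2.0 * temperature + rng.gauss(0.0, 0.1)

rng = random.Random(0)
n_particles = 1000
errors = []
for _ in range(100):
    temperature = rng.uniform(0.0, 10.0)
    state = simulate_system(n_particles, temperature, rng)
    summary = summarize(state)            # 1000 numbers reduced to 1
    predicted = 2.0 * summary             # predict using only the summary
    actual = far_away_observable(temperature, rng)
    errors.append(abs(predicted - actual))

mean_error = statistics.fmean(errors)
print(f"state dimension: {n_particles}, summary dimension: 1")
print(f"mean prediction error from the summary alone: {mean_error:.3f}")
```

In this toy setup the one-number summary predicts the distant observable about as well as the observable's own noise allows, even though the full state has 1000 dimensions. Any observer who needs to predict the distant observable has a reason to track that same summary, which is the intuition behind the convergence claim.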
Currently, the best world-modeling approaches are denoising for visual processing and the attention mechanism for language processing.
Multimodal Neurons from OpenAI (2021, Gabriel Goh)
In 2005, a letter published in Nature described human neurons responding to specific people, such as Jennifer Aniston or Halle Berry. The exciting thing was that they did so regardless of whether they were shown photographs, drawings, or even images of the person’s name. The neurons were multimodal. You are looking at the far end of the transformation from metric, visual shapes to conceptual information.
Neuron activations in the left prefrontal cortex respond to words in a way that resembles AI neuron activations (the paper actually compares them with word embeddings)
Semantic encoding during language comprehension at single-cell resolution
World Model Interpretability with Internal Interface Theory
If the internal interfaces through which an AI's modules interact take a consistent form, it becomes more likely that humans can learn the format of these interfaces and interpret the entire world model at once.
Key claims, theorems, and critiques
Proposal (Wentworth, 2021)
Emergent Computations in Artificial Neural Networks and Real Brains
The discovery of similar circuits in humans and AI also supports this claim