NAH
The efficient abstractions that an AI learns reflect properties of the environment itself
- Abstractability - The physical world can be abstracted: it can be summarized by information of much lower dimension than the full complexity of the system
- Human-Compatibility - These low-dimensional abstractions align with the abstractions humans use
- Convergence - A wide variety of cognitive structures are likely to use similar abstractions
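The Abstractability claim can be illustrated with a toy model. The sketch below is not taken from any NAH source; the names `simulate`, `summarize`, `temperature`, and `distant_reading`, and all numeric values, are hypothetical choices made for illustration. It assumes a system of 1000 noisy micro-variables driven by one hidden macro-variable, and checks that a single-number summary is enough to predict what a "distant" observer measures.

```python
import random
import statistics


def simulate(n_particles=1000, subset_size=200, seed=0):
    """Toy system: many micro-variables fluctuating around one macro-variable."""
    rng = random.Random(seed)
    # Hidden macro-variable (hypothetical "temperature" of the system).
    temperature = rng.uniform(250.0, 350.0)
    # Micro-state: each particle's value is the macro-variable plus noise.
    micro_state = [rng.gauss(temperature, 20.0) for _ in range(n_particles)]
    # A distant observer only sees an average over a random subset of particles.
    subset = rng.sample(range(n_particles), subset_size)
    distant_reading = statistics.fmean(micro_state[i] for i in subset)
    return micro_state, distant_reading


def summarize(micro_state):
    """One-number abstraction of the full micro-state: its mean."""
    return statistics.fmean(micro_state)


micro_state, distant_reading = simulate()
summary = summarize(micro_state)
error = abs(summary - distant_reading)

print(f"micro-state dimension: {len(micro_state)}")
print(f"summary dimension:     1")
print(f"prediction error:      {error:.2f}")
```

In this toy setup the per-particle noise has a standard deviation of 20, yet the one-number summary is expected to predict the distant reading to within a few units, because the individual fluctuations average out. That is only an analogy for the hypothesis, not evidence for it.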
At present, the approaches that do world modeling best are noise reduction (denoising) on the visual side and the attention mechanism on the language side.
Neuron activations in the left prefrontal cortex respond to words in a way that resembles AI neuron activations (in the paper, the comparison is actually to word embeddings)
Semantic encoding during language comprehension at single-cell resolution
World Model Interpretability with Internal Interface Theory
If the internal interfaces through which an AI's modules interact are formed consistently, it becomes more likely that humans can understand the format of those interfaces and interpret the entire world model at once.