More intelligence for free by scaling
Andrej Karpathy has argued that the model needed for raw intelligence will be surprisingly small, since current models waste a large amount of capacity memorizing facts that do not matter. What remains is a "cognitive core" (math, physics, computing, prediction), analogous to the human brain's layered structure.
The recommendation is to scale training tokens roughly linearly with model size. The constraints on scaling test-time compute differ substantially from those on LLM pretraining.
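To make "tokens scale linearly with model size" concrete, here is a minimal sketch assuming the Chinchilla-style rule of thumb of about 20 training tokens per parameter and the common FLOPs ≈ 6 · N · D cost approximation; both constants are assumptions for illustration, not taken from these notes.

```python
# Minimal sketch of compute-optimal sizing.
# Assumptions (not from these notes): ~20 training tokens per parameter
# (Chinchilla-style heuristic) and ~6 FLOPs per parameter per token.

TOKENS_PER_PARAM = 20  # assumed rule of thumb

def compute_optimal_tokens(n_params: float) -> float:
    """Training tokens scale linearly with model size under this heuristic."""
    return TOKENS_PER_PARAM * n_params

def training_flops(n_params: float, n_tokens: float) -> float:
    """Rough pretraining cost: ~6 FLOPs per parameter per token."""
    return 6 * n_params * n_tokens

if __name__ == "__main__":
    for n_params in (1e9, 7e9, 70e9):
        d = compute_optimal_tokens(n_params)
        c = training_flops(n_params, d)
        print(f"{n_params:.0e} params -> {d:.1e} tokens, ~{c:.1e} FLOPs")
```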
AI Scaling Notion
AI Scaling Methods
Scaling Law (OpenAI 2020)
Primates have a neural architecture that scales unusually well compared with the brains of other species, analogous to how transformers have better scaling curves than LSTMs and RNNs.
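To show what "better scaling curve" means, the sketch below evaluates a power-law fit of loss versus training compute, L(C) = L_inf + a * C^(-b), for two architectures. All coefficients are hypothetical and chosen only for illustration, in the spirit of the observation that transformer loss falls faster with compute than LSTM loss.

```python
# Illustrative only: hypothetical power-law scaling curves
# L(C) = L_inf + a * C**(-b). All coefficients below are made up.

def loss_at_compute(compute: float, l_inf: float, a: float, b: float) -> float:
    """Power-law fit of loss vs. training compute (illustrative form)."""
    return l_inf + a * compute ** (-b)

# Hypothetical fits: the "transformer-like" curve has a steeper exponent b,
# so its loss falls faster as compute grows.
curves = {
    "transformer-like": dict(l_inf=1.7, a=8.0, b=0.07),
    "lstm-like": dict(l_inf=1.7, a=8.0, b=0.04),
}

for compute in (1e18, 1e20, 1e22):
    readings = {name: loss_at_compute(compute, **p) for name, p in curves.items()}
    summary = ", ".join(f"{name}={loss:.2f}" for name, loss in readings.items())
    print(f"C={compute:.0e}: {summary}")
```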