After pretraining scaling and the test-time scaling of reasoning models, continual learning represents the third axis of scaling. How long we can keep training and educating a model will become a competitive advantage in the industry, and it intersects with catastrophic forgetting: an intelligence that does not die appears to be the opposite of humans, yet catastrophic forgetting applies to humans as well, making it a more fundamental and unavoidable problem.
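Catastrophic forgetting is easy to reproduce: train a network on one task, then on a second, and performance on the first collapses. Below is a minimal, hypothetical PyTorch sketch (the toy tasks, the `train`/`mse` helpers, and the replay-buffer size are illustrative assumptions, not from the source) showing the effect on two toy regression tasks, and how a small experience-replay buffer, one common mitigation, softens it.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Two toy regression tasks: task A fits sin(x), task B fits cos(x).
def make_task(fn, n=256):
    x = torch.linspace(-3, 3, n).unsqueeze(1)
    return x, fn(x)

task_a = make_task(torch.sin)
task_b = make_task(torch.cos)

def fresh_model():
    return nn.Sequential(nn.Linear(1, 64), nn.Tanh(), nn.Linear(64, 1))

def train(model, data, epochs=500, replay=None):
    opt = torch.optim.Adam(model.parameters(), lr=1e-2)
    loss_fn = nn.MSELoss()
    for _ in range(epochs):
        x, y = data
        loss = loss_fn(model(x), y)
        if replay is not None:  # mix in stored examples from the old task
            rx, ry = replay
            loss = loss + loss_fn(model(rx), ry)
        opt.zero_grad()
        loss.backward()
        opt.step()

def mse(model, data):
    x, y = data
    with torch.no_grad():
        return nn.MSELoss()(model(x), y).item()

# Sequential training without replay: task A is forgotten after task B.
model = fresh_model()
train(model, task_a)
print(f"task A error after A:            {mse(model, task_a):.4f}")
train(model, task_b)
print(f"task A error after B (forgot):   {mse(model, task_a):.4f}")

# Same schedule, but keep a small replay buffer of task-A examples.
model = fresh_model()
train(model, task_a)
idx = torch.randperm(task_a[0].shape[0])[:32]  # retain 32 old examples
buffer = (task_a[0][idx], task_a[1][idx])
train(model, task_b, replay=buffer)
print(f"task A error after B + replay:   {mse(model, task_a):.4f}")
```

Replay is only one of several mitigations; regularization approaches such as Elastic Weight Consolidation instead penalize drift on the weights important to earlier tasks, avoiding the need to store old data.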
AI optimized for continual learning may emerge not as a superintelligence but as super-learners, appearing as the distinct individuals we once imagined. Before meta-learners and few-shot learners, the limitations of narrow AI bred excessive faith in general AI. However, if we understand that AI's more fundamental paradigm lies not in ability itself but in the prior ability to learn, then advanced narrow AI that can learn anything in a general way is also valuable. This is desirable because it induces intelligence rather than knowledge.
Continual Learning

Ilya Sutskever 2025

Seonglae Cho
![[AI Paper Review] Continual Learning on Deep Learning](https://www.notion.so/image/https%3A%2F%2Fmiro.medium.com%2Fv2%2Fresize%3Afill%3A152%3A152%2F1*sHhtYhaCe2Uc3IU0IgKwIQ.png?table=block&id=eb19e61e-9b21-4d77-9e91-d16b6932a3eb&cache=v2)
![[AI Paper Review] Continual Learning on Deep Learning](https://www.notion.so/image/https%3A%2F%2Fmiro.medium.com%2Fv2%2Fresize%3Afit%3A987%2F1*8smZsZHfrRm_xyjhW_hf-w.png?table=block&id=eb19e61e-9b21-4d77-9e91-d16b6932a3eb&cache=v2)
