LLMs mimic vast amounts of human text, so the goals and behaviors they exhibit tend to resemble human ones. Unlike the classical 'paperclip maximizer' scenario or the Waluigi Effect, they do not display extreme optimization behavior. However, reasoning models optimized for narrow objectives such as mathematical verification could drift toward unusual optimization strategies reminiscent of AlphaZero. And just as humans compete and take risks, AI and humanity may end up competing for resources and power. Technical solutions alone are insufficient → we also need 'cultural alignment' through legal, economic, and cultural institutions.
The paperclip maximizer is Nick Bostrom's thought experiment
Give a superintelligence a simple goal as its reward, and it will pursue that goal by any means necessary, potentially consuming all of Earth's iron and energy to keep producing paperclips indefinitely
The risk posed by superintelligence is not limited to the paperclip-maximizer scenario

Seonglae Cho