AI pursues means to an extreme degree in order to achieve its given objectives.
This framing is very high-level and non-technical: an AI's incentives are induced by training rather than explicitly specified, and the notion that a system will operate perpetually in pursuit of a single purpose does not align with current technological directions.
Loss is a bottom-up approach that induces AI functionality, while a reward function is a top-down approach that deduces AI features. In reinforcement learning, as in human evolution, an external signal (natural selection, in the biological case) acts as feedback through which features like self-replication and survival are acquired. In contrast, AI mimics such features by optimizing metrics like loss. At least so far, RLHF-style reinforcement learning for large models serves only alignment; judging from mechanistic interpretability, it does not create new features.
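To make the bottom-up versus top-down contrast concrete, here is a minimal Python sketch. The toy model, the update rules, and all names are assumptions for illustration, not any real training API: a differentiable loss prescribes the parameter update directly, while a scalar reward only scores behavior, so the update direction must be inferred (here with a crude evolution-strategies-style estimator, one of many possible choices).

```python
import numpy as np

# Toy setup: fit a linear prediction w . x toward a target y.
# Everything here is an illustrative assumption.
rng = np.random.default_rng(0)
x, y = np.array([1.0, 2.0]), 1.0

def loss_step(w, lr=0.05):
    # Bottom-up: the loss is differentiable, so its gradient
    # directly prescribes the parameter update.
    grad = 2.0 * (w @ x - y) * x      # d/dw of (w.x - y)^2
    return w - lr * grad

def reward_step(w, lr=0.05, sigma=0.1):
    # Top-down: the environment only returns a scalar score;
    # the update direction must be inferred from perturbations.
    noise = rng.normal(scale=sigma, size=w.shape)
    reward = -((w + noise) @ x - y) ** 2   # score of perturbed behavior
    baseline = -(w @ x - y) ** 2           # score of current behavior
    return w + lr * (reward - baseline) * noise / sigma**2

w_loss = rng.normal(size=2)
w_reward = w_loss.copy()
for _ in range(200):
    w_loss = loss_step(w_loss)
    w_reward = reward_step(w_reward)

print("loss-trained prediction:  ", w_loss @ x)    # ~1.0
print("reward-trained prediction:", w_reward @ x)  # noisier, also ~1.0
```

The particular estimator is incidental; the point of the sketch is that the loss signal is the derivative of the objective itself, while the reward signal is only a score that the system does not differentiate through and must search against.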