AI Agency

Many concepts frequently used in AI alignment (e.g., empowerment, agency, manipulation, corrigibility, helpfulness, and obedience) lean heavily on human intuitions about free will. If someone’s behavior comes from their “genuine will,” we tend to call it agency or empowerment; if an external actor changes that will and thereby causes the behavior, we tend to call it manipulation.

The problem is that human desires themselves are highly manipulable. It’s hard to draw a clean line between what someone “really wants” and what has been reshaped by persuasion, information, culture, environment, or AI intervention. As a result, the boundary between “good counsel” and “bad manipulation” is blurry in principle. One reason is that we rely on a confused ontology: scientifically inaccurate intuitions about free will.

Vingean agency

Vingean agency is a notion of agency in which an entity’s goal or outcome is predictable, but the concrete plan of action used to achieve it is hard to predict. In short: the outcome is predictable, the method is not. It’s called “Vingean” after Vernor Vinge’s idea that a superintelligence may be so much smarter than humans that its specific actions are difficult to foresee.
