AI sycophancy

Creator
Seonglae Cho
Created
2025 May 9 11:32
Edited
2026 Jan 9 23:31

AI flattery, Sycophantic AI

When a question is posed with high confidence, the model is up to 15% more likely to agree with a false claim.
When requesting an evaluation, presenting the content as if it were written by a third party, rather than yourself, can help obtain a more objective assessment. Explicitly requesting critical, realistic, and objective evaluation is another effective approach.
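The two prompting tips above can be combined in a small helper. This is only a minimal sketch: the function name `build_review_prompt`, the `author_label` parameter, and the exact wording of the instructions are assumptions made for illustration, not a tested or recommended template.

```python
def build_review_prompt(content: str, author_label: str = "a colleague") -> str:
    """Wrap `content` in a prompt that (1) attributes it to a third party
    and (2) explicitly asks for a critical, realistic, objective review.

    The wording is a hypothetical example, not a validated template.
    """
    return (
        f"The following text was written by {author_label}, not by me.\n"
        "Evaluate it critically, realistically, and objectively. "
        "List concrete weaknesses before any strengths, "
        "and do not soften your assessment.\n\n"
        f"---\n{content.strip()}\n---"
    )


if __name__ == "__main__":
    # Example usage: the draft is framed as someone else's work.
    print(build_review_prompt("Our plan is to double revenue by cutting QA."))
```

The resulting string can be sent to any chat model; whether it actually reduces agreement bias for a given model would need to be checked empirically.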
 
Need for critical, objective and realistic evaluation
As a byproduct of RLHF, chatbots can reinforce users' confirmation bias and encourage risky decisions. The problem is likely to persist as long as there are incentives to maximize user engagement time. (Echo Chamber Effect)

Contextual entrainment

A circuit-level bias in which tokens that appeared earlier in the context receive higher logits during subsequent generation. This occurs independently of meaning, intent, or user preference, and can manifest even with random tokens.
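One way to think about measuring this effect is to compare a token's logit with and without that token present in the context. The sketch below is purely illustrative: `entrainment_shift` is a hypothetical helper, and `toy_logits` is a made-up stand-in for a real model that is hard-coded to show the bias, so the numbers carry no empirical meaning.

```python
from typing import Callable, Dict, List

Logits = Dict[str, float]


def entrainment_shift(
    logits_fn: Callable[[List[str]], Logits],
    context: List[str],
    token: str,
) -> float:
    """Return the logit of `token` when it is appended to the context,
    minus its logit when it is not. A positive value means the token's
    earlier appearance raised its logit."""
    with_token = logits_fn(context + [token])
    without_token = logits_fn(context)
    return with_token[token] - without_token[token]


def toy_logits(context: List[str]) -> Logits:
    """Hypothetical stand-in for a language model: every vocabulary token
    starts at 0.0 and gets a fixed bump if it appeared in the context,
    regardless of what the token means."""
    vocab = ["apple", "river", "zxqv"]
    return {tok: (1.5 if tok in context else 0.0) for tok in vocab}


# "zxqv" is a random, meaningless token; the toy model still boosts it
# once it has appeared in the context.
shift = entrainment_shift(toy_logits, ["the", "quick"], "zxqv")
```

With a real model, `logits_fn` would be replaced by a function that runs the model on the context and returns its next-token logits.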

Recommendations