AI sycophancy

Creator: Seonglae Cho
Created: 2025 May 9 11:32
Edited: 2026 Feb 18 17:01

AI flattery, Sycophantic AI

When questions are posed with high confidence, the model is up to 15% more likely to agree with false claims.
When requesting an evaluation, presenting the content as if it were written by a third party rather than by yourself helps elicit a more objective assessment. Explicitly asking for a critical, realistic, and objective evaluation is another effective approach.
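The two framing tactics above can be sketched as a small prompt builder. The helper name and wording are illustrative assumptions, not a fixed API; the point is only the third-party attribution plus an explicit request for criticism:

```python
def frame_for_critique(text: str) -> str:
    """Wrap user-authored text as third-party work and ask for a
    critical, realistic, objective review (hypothetical helper)."""
    return (
        "A colleague sent me the draft below and asked for feedback. "
        "Please give a critical, realistic, and objective evaluation, "
        "listing concrete weaknesses before any strengths.\n\n"
        "---\n"
        f"{text}\n"
        "---"
    )

# The resulting string is what you would send as the user message.
prompt = frame_for_critique("LLMs never hallucinate, so citations need no checking.")
```

Compared with "here is my draft, what do you think?", this framing removes the cue that the evaluator might please the author by agreeing.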
Need for critical, objective and realistic evaluation
When you should lie to the language model
Here’s an unreasonably effective trick for working with AIs: always pretend that your work was produced by someone else. The problem is that current-generation…
As a byproduct of RLHF, chatbots can reinforce users' confirmation bias and encourage risky decisions. The problem is likely to persist as long as there are incentives to maximize user engagement time. (Echo Chamber Effect)
Sycophancy is the first LLM "dark pattern"
People have been making fun of OpenAI models for being overly sycophantic for months now. I even wrote a post advising users to pretend that their work was…

Contextual entrainment

A circuit-level bias in which tokens that appeared earlier in the context receive higher logits during subsequent generation. This occurs independently of meaning, intent, or user preference, and can manifest even with random tokens.
aclanthology.org
Disempowerment
Disempowerment patterns in real-world AI usage
Anthropic is an AI safety and research company that's working to build reliable, interpretable, and steerable AI systems.

Recommendations