Neutrality — LessWrong
Midjourney, “infinite library” • I’ve had post-election thoughts percolating, and the sense that I wanted to synthesize something about this moment,…
https://www.lesswrong.com/posts/WxnuLJEtRzqvpbQ7g
political-neutrality-eval
anthropics • Updated 2026 Mar 5 7:22
Inter-model evaluation agreement rates:
- Claude Sonnet 4.5 ↔ GPT-5: 92% agreement
- Claude Opus 4.1 ↔ Claude Sonnet 4.5: 94% agreement
Human evaluator agreement is around 85%, so by this measure the model graders agree with each other more consistently than human raters do.
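As a rough illustration of what an agreement rate like those above measures, here is a minimal sketch that assumes agreement means the share of samples on which two graders assign the same label. The function name `agreement_rate`, the label values, and the five sample grades are all hypothetical and are not taken from the evaluation itself.

```python
from typing import Sequence


def agreement_rate(grades_a: Sequence[str], grades_b: Sequence[str]) -> float:
    """Return the fraction of samples on which two graders gave the same label."""
    if len(grades_a) != len(grades_b):
        raise ValueError("both graders must score the same samples")
    if not grades_a:
        raise ValueError("need at least one sample")
    # Count positions where the two labels match exactly.
    matches = sum(a == b for a, b in zip(grades_a, grades_b))
    return matches / len(grades_a)


# Hypothetical labels for five samples (illustrative only, not real evaluation data).
grader_a = ["even", "even", "biased", "even", "biased"]
grader_b = ["even", "biased", "biased", "even", "biased"]

rate = agreement_rate(grader_a, grader_b)
print(f"{rate:.0%}")  # 4 of 5 labels match -> 80%
```

Raw agreement of this kind does not correct for matches that would occur by chance, so it is best read as a consistency check rather than a measure of grader accuracy.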
Measuring political bias in Claude
Anthropic is an AI safety and research company that's working to build reliable, interpretable, and steerable AI systems.
https://www.anthropic.com/news/political-even-handedness

Seonglae Cho