Neutrality

Creator

Seonglae Cho

Created

2024 Nov 18 0:45

Editor

Seonglae Cho

Edited

2026 Mar 6 16:31

Refs

Neutrality — LessWrong

Midjourney, “infinite library” • I’ve had post-election thoughts percolating, and the sense that I wanted to synthesize something about this moment,…

https://www.lesswrong.com/posts/WxnuLJEtRzqvpbQ7g

Inter-model evaluation agreement rates:

Claude Sonnet 4.5 ↔ GPT-5: 92% agreement

Claude Opus 4.1 ↔ Claude Sonnet 4.5: 94% agreement

Agreement between human evaluators is around 85%, lower than either model pair above → model graders are more consistent with each other than human raters are.
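A minimal sketch of how a pairwise agreement rate like the ones above could be computed. This note does not record the exact metric used, so the sketch assumes the simplest reading: the share of samples on which two graders give the same label. The function name `agreement_rate` and the labels are hypothetical, chosen only for illustration.

```python
from typing import Hashable, Sequence


def agreement_rate(grades_a: Sequence[Hashable], grades_b: Sequence[Hashable]) -> float:
    """Fraction of samples on which two graders assign the same label.

    Assumption: agreement means an exact per-sample match. The original
    evaluation may define agreement differently.
    """
    if len(grades_a) != len(grades_b):
        raise ValueError("both graders must grade the same samples")
    if not grades_a:
        raise ValueError("at least one graded sample is required")
    matches = sum(a == b for a, b in zip(grades_a, grades_b))
    return matches / len(grades_a)


# Hypothetical labels from two graders over the same five responses.
grader_a = ["even", "even", "biased", "even", "biased"]
grader_b = ["even", "even", "biased", "biased", "biased"]

print(f"{agreement_rate(grader_a, grader_b):.0%}")  # prints "80%"
```

Raw agreement does not correct for matches that would occur by chance, so a chance-corrected statistic such as Cohen's kappa may be a better fit when one label dominates.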

Measuring political bias in Claude

Anthropic is an AI safety and research company that's working to build reliable, interpretable, and steerable AI systems.

https://www.anthropic.com/news/political-even-handedness

jagged
