Claude's Constitution
Anthropic is an AI safety and research company that's working to build reliable, interpretable, and steerable AI systems.
https://www.anthropic.com/constitution

v1
Claude’s Constitution
Anthropic is an AI safety and research company that's working to build reliable, interpretable, and steerable AI systems.
https://www.anthropic.com/news/claudes-constitution

v2
Claude's new constitution
A new approach to a foundational document that expresses and shapes who Claude is
https://www.anthropic.com/news/claude-new-constitution
corrigibility
Terrified Comments on Corrigibility in Claude's Constitution — LessWrong
(Previously: Prologue.) • Corrigibility as a term of art in AI alignment was coined as a word to refer to a property of an AI being willing to let it…
https://www.lesswrong.com/posts/K2Ae2vmAKwhiwKEo5/terrified-comments-on-corrigibility-in-claude-s-constitution


Seonglae Cho