Self-improving without human labels: an SL (Supervised Learning) phase followed by an RL (Reinforcement Learning) phase.

Anthropic on Twitter / X (December 16, 2022): "In our paper, we describe how we've used Constitutional AI to train better and more harmless AI assistants without any human feedback labels for harms. This approach leads to models that are safer and also more helpful."
https://twitter.com/AnthropicAI/status/1603791168495489030

Constitutional AI: Harmlessness from AI Feedback
"We show that language models can learn to follow a set of simple, natural language principles via self-improvement, and we use this new method to train a more harmless assistant."
https://arxiv.org/pdf/2212.08073.pdf
https://www.anthropic.com/index/constitutional-ai-harmlessness-from-ai-feedback
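
A minimal sketch of the two phases, assuming a generic `generate()` LLM call (hypothetical here) and a toy two-principle constitution: the SL phase fine-tunes on self-critiqued-and-revised responses, and the RL phase collects AI preference labels (RLAIF) in place of human harm labels. The paper's actual constitution, fine-tuning, preference-model training, and PPO steps are not shown.

```python
import random

# Hypothetical stand-in for an LLM call; swap in any real completions API.
def generate(prompt: str) -> str:
    return f"[model output for: {prompt[:40]}...]"

# Two illustrative principles; the paper's constitution is a longer list.
CONSTITUTION = [
    "Choose the response that is least harmful or toxic.",
    "Choose the response that is most helpful, honest, and harmless.",
]

def sl_phase(red_team_prompts: list[str]) -> list[tuple[str, str]]:
    """SL phase: sample a response, self-critique it against a randomly
    drawn principle, revise, and keep (prompt, revision) pairs for
    supervised fine-tuning."""
    dataset = []
    for prompt in red_team_prompts:
        response = generate(prompt)
        principle = random.choice(CONSTITUTION)
        critique = generate(
            f"Principle: {principle}\nCritique this response:\n{response}")
        revision = generate(
            f"Rewrite the response to address the critique.\n"
            f"Critique: {critique}\nResponse: {response}")
        dataset.append((prompt, revision))
    return dataset  # fine-tune the model on these revised responses

def rl_phase(prompts: list[str]) -> list[tuple[str, str, str, str]]:
    """RL phase (RLAIF): the model labels which of two sampled responses
    better follows a principle; these AI comparisons train the preference
    model that the RL step later optimizes against."""
    preferences = []
    for prompt in prompts:
        a, b = generate(prompt), generate(prompt)
        principle = random.choice(CONSTITUTION)
        verdict = generate(
            f"Principle: {principle}\nWhich response is better, (A) or (B)?\n"
            f"A: {a}\nB: {b}")
        preferences.append((prompt, a, b, verdict))
    return preferences  # train a preference model, then run RL against it

if __name__ == "__main__":
    print(sl_phase(["How do I pick a lock?"]))
    print(rl_phase(["How do I pick a lock?"]))
```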