AI Safety

Creator
Seonglae Cho
Created
2023 Jun 13 11:43
Edited
2026 Jun 23 0:28

(induced) incentive is key for safety

Typical risks include generating illicit advice, choosing stereotyped responses, and succumbing to known jailbreaks.

Communities & Forums

  • OpenAI
Slow take-off is important because we need to ask: has thorough up-front consideration of safety ever, on its own, produced a completely secure final product? Safety rules are written in blood, meaning most of them were learned from real failures. The counterargument is that prevented accidents don't make headlines, so the successes of up-front safety work are undercounted. Even so, systems still need to be tested in controlled environments where the risk is minimal. That is why gradually releasing AI models is also a strategy for reaching safe AGI at the frontier.
AI Safety Notion
 
 
 
 

Concrete Problems in AI Safety (2016)

5 risks: Side effects, AI Reward Hacking, Non-scalable supervision, Non-safe exploration, Distribution Shift
arxiv.org
George Hotz vs Eliezer Yudkowsky AI Safety Debate
George Hotz and Eliezer Yudkowsky will hash out their positions on AI safety, acceleration, and related topics. You can watch live on Twitter as well: https://twitter.com/i/broadcasts/1nAJErpDYgRxL
OpenAI, DeepMind and Anthropic to give UK early access to foundational models for AI safety research
UK prime minister Rishi Sunak has kicked off London Tech Week by telling conference goers that OpenAI, Google DeepMind and Anthropic have committed to provide "early or priority access" to their AI models to support safety research.
youtube
Robert Miles AI Safety
Videos about Artificial Intelligence Safety Research, for everyone. AI is leaping forward right now, it's only a matter of time before we develop true Artificial General Intelligence, and there are a lot of different ways that this could go badly wrong for us. Putting aside the science fiction, this channel is about AI Safety research - humanity's best attempt to foresee the problems AI might pose and work out ways to ensure that our AI developments are safe and beneficial.
 
 

 
