
AI Safety Index

Creator: Seonglae Cho
Created: 2024 Oct 21 20:27
Editor: Seonglae Cho
Edited: 2025 Feb 4 15:30
Refs
AI Jailbreak
Machine Unlearning
AI Safety Indices
AI Safety Level
Preparedness Framework

Sabotage

Sabotage Evaluations for Frontier Models (assets.anthropic.com)
https://assets.anthropic.com/m/377027d5b36ac1eb/original/Sabotage-Evaluations-for-Frontier-Models.pdf

Copyright Seonglae Cho