Texonom

Distributed control

Creator
Seonglae Cho
Created
2025 Apr 14 13:20
Editor
Seonglae Cho
Edited
2025 Apr 14 13:20
Refs
Adaptive Deployment of Untrusted LLMs Reduces Distributed Threats
As large language models (LLMs) become increasingly capable, it is prudent to assess whether safety measures remain effective even if LLMs intentionally try to bypass them. Previous work...
https://arxiv.org/abs/2411.17693

Copyright Seonglae Cho