Approval Reward

Creator: Seonglae Cho
Created: 2025 Dec 26 18:58
Editor: Seonglae Cho
Edited: 2025 Dec 26 18:58
Refs
6 reasons why “alignment-is-hard” discourse seems alien to human intuitions, and vice-versa — LessWrong
TL;DR: AI alignment has a culture clash. On one side, the “technical-alignment-is-hard” / “rational agents” school-of-thought argues that we should expect future powerful AIs to be power-seeking ruthless consequentialists. On the other side, people observe that both humans and LLMs are obviously capable of behaving like, well, not that. The latter group accuses the former of head-in-the-clouds abstract theorizing gone off the rails, while the former accuses the latter of mindlessly assuming that the future will always be the same as the present, rather than trying to understand things …
https://www.lesswrong.com/posts/d4HNRdw6z7Xqbnu5E/6-reasons-why-alignment-is-hard-discourse-seems-alien-to
 
 

Copyright Seonglae Cho