Preference Optimization

Creator
Seonglae Cho
Created
2023 Sep 24 4:20
Editor
Seonglae Cho
Edited
2025 Feb 19 12:09
Refs
Language Model RL
Fine Tuning
Preference Optimization methods
DPO
SLiC
ORPO
IRPO
SimPO
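The methods listed above share a common structure: they fine-tune the policy directly on pairwise preference data rather than first training a separate reward model. As a concrete illustration, here is a minimal sketch of the DPO loss, assuming sequence-level log-probabilities for the chosen and rejected responses have already been computed; the function and argument names are illustrative and not taken from any of the linked pages.

```python
# Minimal sketch of the DPO (Direct Preference Optimization) loss.
# Assumes per-token log-probabilities have already been summed per sequence.
import torch
import torch.nn.functional as F

def dpo_loss(policy_chosen_logps: torch.Tensor,
             policy_rejected_logps: torch.Tensor,
             ref_chosen_logps: torch.Tensor,
             ref_rejected_logps: torch.Tensor,
             beta: float = 0.1) -> torch.Tensor:
    """Maximize the margin between chosen and rejected completions,
    measured as log-ratios against a frozen reference model."""
    chosen_ratio = policy_chosen_logps - ref_chosen_logps
    rejected_ratio = policy_rejected_logps - ref_rejected_logps
    logits = beta * (chosen_ratio - rejected_ratio)
    # -log(sigmoid(logits)) pushes the policy toward the chosen response
    return -F.logsigmoid(logits).mean()
```

Variants like SLiC, ORPO, and SimPO swap in different margin or normalization terms (e.g. hinge loss, odds ratio, length-normalized log-probabilities) but keep the same pairwise setup.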

Preference Elicitation from RL

Learning from human preferences (OpenAI)
https://openai.com/index/learning-from-human-preferences/
Paper: https://arxiv.org/pdf/1706.03741
One step towards building safe AI systems is to remove the need for humans to write goal functions, since using a simple proxy for a complex goal, or getting the complex goal a bit wrong, can lead to undesirable and even dangerous behavior. In collaboration with DeepMind's safety team, we've developed an algorithm which can infer what humans want by being told which of two proposed behaviors is better.
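The approach described above infers a reward signal from binary comparisons between two behavior segments. Below is a minimal sketch under the Bradley-Terry assumption used in that line of work; the network architecture and names are assumptions for illustration, not code from the referenced paper.

```python
# Hedged sketch of reward learning from pairwise human preferences.
import torch
import torch.nn as nn
import torch.nn.functional as F

class RewardModel(nn.Module):
    def __init__(self, obs_dim: int, hidden: int = 64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(obs_dim, hidden), nn.ReLU(), nn.Linear(hidden, 1))

    def forward(self, segment: torch.Tensor) -> torch.Tensor:
        # Sum predicted per-step rewards over a trajectory segment
        # segment: (batch, timesteps, obs_dim) -> (batch,)
        return self.net(segment).sum(dim=1).squeeze(-1)

def preference_loss(reward_model: RewardModel,
                    seg_a: torch.Tensor,
                    seg_b: torch.Tensor,
                    prefer_a: torch.Tensor) -> torch.Tensor:
    """Bradley-Terry cross-entropy: the probability that segment A is
    preferred is sigmoid of the difference in summed predicted rewards."""
    r_a, r_b = reward_model(seg_a), reward_model(seg_b)
    logits = r_a - r_b
    return F.binary_cross_entropy_with_logits(logits, prefer_a.float())
```

The learned reward model is then used as the objective for a standard RL algorithm, which is the precursor to modern RLHF and preference-optimization pipelines.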

Utility Engineering

Value systems over AI preferences with a high degree of structural coherence emerge as models scale.
https://arxiv.org/pdf/2502.08640

Copyright Seonglae Cho