DeepSeek R1

Creator
Seonglae Cho
Created
2025 Jan 21 12:2
Edited
2025 Mar 28 21:31

DeepSeek Reasoner-1

Based on DeepSeek-V3-Base and RL-tuned on a dataset of only 20GB.
Its alignment is not rigid, which makes it a good target for jailbreak testing.
 
 
 
Fully open-sourced
Model and distilled variants
Tech report
The RL-only variant (DeepSeek-R1-Zero) has limitations: repetitive answers, low readability, and language mixing
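
Two of the limitations listed above, repetitive answers and language mixing, can be roughly flagged in generated text with simple heuristics. The following is a minimal sketch, not a method from the tech report: the function names `repeated_ngram_ratio` and `scripts_used` are hypothetical, and both the n-gram size and the Unicode-name trick for guessing a script are assumptions chosen for illustration.

```python
import unicodedata
from collections import Counter


def repeated_ngram_ratio(text, n=3):
    """Fraction of word n-grams that duplicate an earlier n-gram.

    0.0 means no n-gram occurs twice; values near 1.0 mean heavy repetition.
    (Hypothetical helper; n=3 is an arbitrary choice.)
    """
    words = text.split()
    ngrams = [tuple(words[i:i + n]) for i in range(len(words) - n + 1)]
    if not ngrams:
        return 0.0
    counts = Counter(ngrams)
    # Every occurrence beyond the first counts as a repeat.
    repeats = sum(c - 1 for c in counts.values())
    return repeats / len(ngrams)


def scripts_used(text):
    """Rough set of scripts in the text (hypothetical helper).

    Uses the first word of each letter's Unicode name, e.g. 'LATIN' or 'CJK'.
    This is a coarse approximation, not a real script-detection API.
    """
    scripts = set()
    for ch in text:
        if ch.isalpha():
            name = unicodedata.name(ch, "")
            if name:
                scripts.add(name.split()[0])
    return scripts


if __name__ == "__main__":
    looping = "the cat sat the cat sat the cat sat"
    mixed = "The answer is 答案"
    print(repeated_ngram_ratio(looping))
    print(scripts_used(mixed))
```

A response whose repetition ratio is high, or whose script set contains more than one entry for a single-language prompt, is a candidate for manual review. Thresholds would need to be tuned on real outputs.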
 
 

Recommendations