Agentic Evaluation

Creator
Seonglae Cho
Created
2024 Nov 30 10:39
Edited
2024 Dec 3 11:23
Refs
AI Agent
  • End-to-end evaluation is not just the sum of the eval pipeline's components
  • It is hard to satisfy Good benchmark principles