AxBench

Creator
Seonglae Cho
Created
2025 Feb 6 11:4
Editor
Edited
2025 Feb 9 15:42
Refs
  • Concept detection
    • classification performance
  • Model steering
    • LLM judge to rate steered output
      1. Concept score
      2. Instruct score
      3. Fluency score
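
The three judge scores above need to be combined into one steering score per output. This note does not say how, so the sketch below rests on two assumptions: each score is rated on a 0 to 2 scale, and the three are aggregated with a harmonic mean. The function name `overall_steering_score` is hypothetical.

```python
def overall_steering_score(concept: float, instruct: float, fluency: float) -> float:
    """Combine three LLM-judge ratings into a single steering score.

    Assumptions (not stated in the note): each rating lies in [0, 2],
    and the aggregate is the harmonic mean of the three ratings.
    """
    scores = (concept, instruct, fluency)
    if any(s < 0 or s > 2 for s in scores):
        raise ValueError("each score is assumed to lie in the range 0..2")
    # A harmonic mean is zero as soon as any single component is zero,
    # so an output that fails one criterion gets no credit overall.
    if min(scores) == 0:
        return 0.0
    return len(scores) / sum(1.0 / s for s in scores)


if __name__ == "__main__":
    print(overall_steering_score(2, 2, 2))  # perfect on all three criteria
    print(overall_steering_score(2, 0, 2))  # ignores the instruction
    print(overall_steering_score(1, 2, 2))  # concept only partly present
```

Under these assumptions, a harmonic mean penalizes imbalance: a steered output that expresses the concept strongly but ignores the instruction or loses fluency scores zero, which an arithmetic mean would not enforce.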

Limitation

Concept detection did not show a significant difference between methods, while the model steering benchmark is built from an instruction-following dataset (Alpaca-Eval), which gives prompt-based steering a considerable advantage.
