
Sparsity based AI Evaluation

Creator
Seonglae Cho
Created
2024 Nov 14 10:25
Edited
2025 Jan 1 21:25
Specific
Computable

Correlation based?

for
  • how effectively a Transformer model uses its dimensions
  • how much further a model could be scaled
  • how to adjust the dimension size
how
  • attention head usage sparsity
  • activation sparsity
  • SAE (not real-time, so a real-time method is needed)
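
A minimal sketch of how activation sparsity could be computed for a single activation vector. The note does not fix a metric, so both measures here are assumptions: the fraction of near-zero entries (with a hypothetical threshold `eps`) and the Hoyer measure, which is 0 for a uniform vector and 1 for a one-hot vector.

```python
import math


def zero_fraction(activations, eps=1e-6):
    """Fraction of entries whose magnitude is at most eps (assumed threshold)."""
    if not activations:
        raise ValueError("activations must be non-empty")
    inactive = sum(1 for a in activations if abs(a) <= eps)
    return inactive / len(activations)


def hoyer_sparsity(activations):
    """Hoyer sparsity: (sqrt(n) - L1/L2) / (sqrt(n) - 1), in [0, 1]."""
    n = len(activations)
    if n < 2:
        raise ValueError("need at least two dimensions")
    l1 = sum(abs(a) for a in activations)
    l2 = math.sqrt(sum(a * a for a in activations))
    if l2 == 0.0:
        raise ValueError("undefined for an all-zero vector")
    return (math.sqrt(n) - l1 / l2) / (math.sqrt(n) - 1)
```

Both functions take a plain list of floats, so a vector pulled from any layer of an open model can be passed in after conversion to a list.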

problem

  • limited to open-source models
  • needs a connection to AI safety, or at least interpretability → use a bias dataset

how about

  • high correlation means highly steerable
  • true, but it overlaps too much with holistic
  • connect activation sparsity with SAE sparsity? mutual information
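
One way the mutual information idea could be tried: collect a per-token activation sparsity score and a per-token SAE sparsity score, bin both, and compute a plug-in estimate. This is only a sketch under assumptions. The equal-width binning, the bin count, and the function names are hypothetical, and the plug-in estimator is biased for small samples.

```python
import math
from collections import Counter


def discretize(values, bins=4):
    """Map real values to equal-width bin indices in [0, bins - 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0] * len(values)
    width = (hi - lo) / bins
    return [min(int((v - lo) / width), bins - 1) for v in values]


def mutual_information(xs, ys):
    """Plug-in mutual information estimate (in bits) of two discrete sequences."""
    if len(xs) != len(ys) or not xs:
        raise ValueError("sequences must be non-empty and of equal length")
    n = len(xs)
    count_x = Counter(xs)
    count_y = Counter(ys)
    count_xy = Counter(zip(xs, ys))
    mi = 0.0
    for (x, y), c in count_xy.items():
        # p(x, y) * log2( p(x, y) / (p(x) * p(y)) )
        mi += (c / n) * math.log2(c * n / (count_x[x] * count_y[y]))
    return mi
```

Usage would look like `mutual_information(discretize(act_sparsity), discretize(sae_sparsity))`, where both lists hold one score per token.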

Depends on whether the eval target is the SAE or the LLM

If the SAE, then the activation vector and the sparse feature vector
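
A toy illustration of the two vectors involved when the SAE is the target. The encoder form `ReLU(W_enc (x - b_dec) + b_enc)` is a commonly used SAE formulation and is assumed here, not taken from this note. The weights are made up for the example.

```python
def sae_encode(x, w_enc, b_enc, b_dec):
    """Assumed encoder: ReLU(W_enc @ (x - b_dec) + b_enc), using plain lists."""
    centered = [xi - bi for xi, bi in zip(x, b_dec)]
    features = []
    for row, bias in zip(w_enc, b_enc):
        pre = sum(w * c for w, c in zip(row, centered)) + bias
        features.append(max(0.0, pre))
    return features


def active_fraction(vec, eps=1e-6):
    """Fraction of entries with magnitude above eps (assumed threshold)."""
    return sum(1 for v in vec if abs(v) > eps) / len(vec)


# Hypothetical 2-dim activation and a 4-feature dictionary.
x = [1.0, -2.0]
w_enc = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0], [0.0, -1.0]]
b_enc = [0.0, 0.0, 0.0, 0.0]
b_dec = [0.0, 0.0]

features = sae_encode(x, w_enc, b_enc, b_dec)
dense_usage = active_fraction(x)          # every activation dimension is used
sparse_usage = active_fraction(features)  # only half of the features fire
```

The activation vector is dense while the feature vector is wider but mostly zero, which is the pair the evaluation would compare.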

keywords

  • AI jailbreak
  • red teaming
How sparse an activation can be while still representing the same text; a sparser representation means the model uses its dimension space more efficiently
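
One possible way to make "how few activations are needed" computable is to count the smallest number of dimensions that retain a given share of the vector's squared norm. This is an interpretation, not a definition from the note. The `energy` threshold and the function name are hypothetical.

```python
def min_active_dims(activations, energy=0.9):
    """Smallest k such that the k largest squared entries hold `energy` of the total."""
    if not 0.0 < energy <= 1.0:
        raise ValueError("energy must be in (0, 1]")
    squared = sorted((a * a for a in activations), reverse=True)
    total = sum(squared)
    if total == 0.0:
        return 0
    running = 0.0
    for k, s in enumerate(squared, start=1):
        running += s
        if running >= energy * total:
            return k
    return len(squared)
```

A lower count for the same input text would suggest that the representation is concentrated in fewer dimensions.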
 
 
 
 
 
