Sparsity based AI Evaluation

Creator

Seonglae Cho

Created

2024 Nov 14 10:25

Editor

Seonglae Cho

Edited

2025 Jan 1 21:25

Specific

Refs

Activation Sparsity

Computable

Correlation based?

for

how effectively a Transformer model uses its dimensions

how much further the model could be scaled

how to manipulate dimension size

how

attention head usage sparsity

activation sparsity

SAE (not real-time, so a real-time version is needed)
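The first two measures above can be sketched in a few lines. This is a minimal illustration, not a fixed definition: the function names, the `eps` and `rel_threshold` cutoffs, and the toy values are all assumptions made for the example, and a real evaluation would read activations and per-head output norms from the model itself.

```python
from typing import Sequence


def activation_sparsity(activations: Sequence[float], eps: float = 1e-6) -> float:
    """Fraction of entries whose magnitude is at most eps (1.0 = fully sparse)."""
    if not activations:
        raise ValueError("activations must be non-empty")
    near_zero = sum(1 for a in activations if abs(a) <= eps)
    return near_zero / len(activations)


def head_usage_sparsity(head_norms: Sequence[float], rel_threshold: float = 0.1) -> float:
    """Fraction of heads whose output norm is below rel_threshold * the largest norm."""
    if not head_norms:
        raise ValueError("head_norms must be non-empty")
    peak = max(head_norms)
    if peak == 0:
        return 1.0  # no head contributes anything
    unused = sum(1 for n in head_norms if n < rel_threshold * peak)
    return unused / len(head_norms)


# Toy post-ReLU MLP activations for one token (hypothetical values).
mlp_acts = [0.0, 0.0, 1.3, 0.0, 0.7, 0.0, 0.0, 0.0]
# Toy per-head output norms for one layer (hypothetical values).
head_norms = [2.0, 0.05, 1.5, 0.1]

print(activation_sparsity(mlp_acts))    # 0.75
print(head_usage_sparsity(head_norms))  # 0.5
```

Both functions return a value in [0, 1], so they can be averaged over tokens and layers to get one number per model.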

problem

limited to open-source models

needs a connection to AI safety, or at least interpretability → use a bias dataset

how about

High Correlation means highly steerable

true, but it overlaps too much with holistic evaluation

link activation sparsity with SAE sparsity? mutual information

depends on whether the eval target is the SAE or the LLM

if the SAE, then the activation vector and the sparse feature vector
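One way to probe the link between activation sparsity and SAE sparsity is to compute both per token and correlate them. The sketch below is only an assumption-laden toy: it uses Pearson correlation as a simpler stand-in for mutual information, a randomly initialised ReLU encoder in place of a trained SAE, and random vectors in place of real model activations. All names (`sae_encode`, `w_enc`, `b_enc`) are hypothetical.

```python
import math
import random
from typing import List, Sequence


def sparsity(vec: Sequence[float], eps: float = 1e-6) -> float:
    """Fraction of entries whose magnitude is at most eps."""
    return sum(1 for v in vec if abs(v) <= eps) / len(vec)


def sae_encode(x: Sequence[float], w_enc: Sequence[Sequence[float]],
               b_enc: Sequence[float]) -> List[float]:
    """Toy SAE encoder: features = ReLU(W_enc @ x + b_enc)."""
    return [max(0.0, sum(w * xi for w, xi in zip(row, x)) + b)
            for row, b in zip(w_enc, b_enc)]


def pearson(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson correlation; NaN if either input has zero variance."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return float("nan")
    return cov / math.sqrt(vx * vy)


rng = random.Random(0)
d_model, n_features, n_tokens = 16, 64, 32

# Random stand-ins for a trained SAE encoder (assumption, not a real SAE).
w_enc = [[rng.gauss(0, 1) for _ in range(d_model)] for _ in range(n_features)]
b_enc = [-1.0] * n_features  # negative bias pushes most features to zero

act_sparsity, feat_sparsity = [], []
for _ in range(n_tokens):
    # Post-ReLU activation vector for one token (random stand-in).
    x = [max(0.0, rng.gauss(0, 1)) for _ in range(d_model)]
    act_sparsity.append(sparsity(x))
    feat_sparsity.append(sparsity(sae_encode(x, w_enc, b_enc)))

r = pearson(act_sparsity, feat_sparsity)
print(f"correlation between activation and SAE feature sparsity: {r:.3f}")
```

With a real model and a trained SAE, the two lists would be filled from the same forward pass, so each pair describes the same token.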

keywords

ai jailbreak

red teaming

how little activation is needed to represent the same text (i.e. how sparse the activations can be), which means the model uses its dimension space more efficiently
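A rough proxy for this idea is to count how many of the largest-magnitude activation entries are needed to keep a given share of the vector's squared norm. This is a sketch under a strong assumption: retained energy is not the same thing as representing the same text, so a real evaluation would check reconstruction or next-token loss instead. The function name and the `energy` parameter are hypothetical.

```python
from typing import Sequence


def min_active_dims(activations: Sequence[float], energy: float = 0.9) -> int:
    """Smallest k such that the k largest squared entries hold >= energy of the total."""
    if not 0 < energy <= 1:
        raise ValueError("energy must be in (0, 1]")
    squares = sorted((a * a for a in activations), reverse=True)
    total = sum(squares)
    if total == 0:
        return 0  # an all-zero vector needs no active dimensions
    kept = 0.0
    for k, s in enumerate(squares, start=1):
        kept += s
        if kept >= energy * total:
            return k
    return len(squares)


# Toy activation vectors for the same text from two hypothetical models.
model_a = [3.0, 0.1, 0.1, 0.1]  # energy concentrated in one dimension
model_b = [1.0, 1.0, 1.0, 1.0]  # energy spread evenly

print(min_active_dims(model_a))  # 1
print(min_active_dims(model_b))  # 4
```

Under this proxy, a lower count for the same text would suggest a more concentrated use of the dimension space.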
