- gemma2b_mmlu_20_ppo_1e-05_0621_074052_30.0_sparse
- gemma2b_mmlu_24_ppo_1e-05_0622_151551_30.0
Without selection
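The multi-layer runs use shared code, so they probably need to be redone; only last k
It just increases the norm, and that alone is what has an effect
The select_action function may not be training properly
Double-layer manipulation scored highest, but with more than 3 layers the score was lower than single-layer manipulation and decreased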
graph?
- Using an SAE with a larger sparse dictionary made no real difference
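One way to probe the norm concern in the notes above is to compare plain additive steering against a variant that rescales the result back to the original norm. This is only a minimal sketch under assumptions: `steer`, `steer_norm_preserving`, the toy vectors, and the scale are all hypothetical and are not taken from the actual training code.

```python
import math


def l2_norm(v):
    """Euclidean norm of a plain Python list of floats."""
    return math.sqrt(sum(x * x for x in v))


def steer(hidden, direction, scale):
    """Plain additive steering: hidden + scale * direction (hypothetical helper)."""
    return [h + scale * d for h, d in zip(hidden, direction)]


def steer_norm_preserving(hidden, direction, scale):
    """Additive steering, then rescale back to the original norm of `hidden`."""
    steered = steer(hidden, direction, scale)
    target = l2_norm(hidden)
    current = l2_norm(steered)
    if current == 0.0:
        # Nothing to rescale; return the zero vector unchanged.
        return steered
    return [x * target / current for x in steered]


# Toy values, chosen only for illustration.
hidden = [3.0, 4.0]
direction = [1.0, 0.0]

plain = steer(hidden, direction, 2.0)
kept = steer_norm_preserving(hidden, direction, 2.0)

print("original norm:", l2_norm(hidden))
print("plain steering norm:", l2_norm(plain))
print("norm-preserving steering norm:", l2_norm(kept))
```

If accuracy moves under plain steering but not under the norm-preserving variant, that would be consistent with the effect coming mostly from the norm increase rather than from the direction.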
Control RL MMLU Models
Llama
Baseline
- white paper 66.7% 5 shot
- base non-select 61.41% 0 shot
- base select 61.42% 0 shot (almost non hallucination)
Single layer
- 30th 61.64% 10min
- 30th 61.69% 5min
- 30th 62.01% 5min decode
- 30th 61.62% 1min
- 30th 61.12% 20min
- 24th 57.66% 20min
- 24th 61.91% 5min
- 28th 61.86% 1
- 20th 62.33%
Corrsteer
- 61.71%
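To make the Llama numbers above easier to compare, the following sketch computes each run's difference from the 0-shot non-select baseline in percentage points. The accuracies are copied from the list; the dictionary labels and the helper name `delta_vs_baseline` are made up here for illustration.

```python
# Accuracies (%) copied from the notes; labels are informal.
BASELINE = 61.41  # base non-select, 0 shot

results = {
    "30th 10min": 61.64,
    "30th 5min": 61.69,
    "30th 5min decode": 62.01,
    "30th 1min": 61.62,
    "30th 20min": 61.12,
    "24th 20min": 57.66,
    "24th 5min": 61.91,
    "28th 1": 61.86,
    "20th": 62.33,
    "Corrsteer": 61.71,
}


def delta_vs_baseline(acc, baseline=BASELINE):
    """Difference from the baseline in percentage points, rounded to 2 decimals."""
    return round(acc - baseline, 2)


deltas = {name: delta_vs_baseline(acc) for name, acc in results.items()}
best = max(deltas, key=deltas.get)

# Print runs from largest gain to largest loss.
for name, d in sorted(deltas.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: {d:+.2f} pp")
print("best run:", best)
```

All differences here are within about one percentage point of the baseline except the 24th-layer 20min run, so single runs should be read with caution.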
meta-llama/Llama-3.1-8B · Hugging Face
https://huggingface.co/meta-llama/Llama-3.1-8B
Seonglae Cho