Logit Lens

Creator

Seonglae Cho

Created

2024 Oct 14 1:33

Editor

Seonglae Cho

Edited

2025 Feb 16 23:39

Refs

https://arxiv.org/pdf/2402.09221

interpreting GPT: the logit lens — AI Alignment Forum

This post relates an observation I've made in my work with GPT-2, which I have not seen made elsewhere. …

https://www.alignmentforum.org/posts/AcKRB8wDpdaN6v6ru/interpreting-gpt-the-logit-lens
