Closest Token Lists

Creator

Seonglae Cho

Created

2024 Dec 28 16:30

Editor

Seonglae Cho

Edited

2025 Jul 1 15:49

Refs

Builds definition trees from ghost tokens that resemble SAE features, and extracts lists of the tokens most closely related to each feature

Limitation: the token-list approach ignores context

Advantage: enables automated interpretability without external LLMs
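
A minimal sketch of the closest-token-list idea, not the tool's actual implementation. It assumes a feature is represented as a vector in the same space as the token embeddings and that "closest" means highest cosine similarity; both are assumptions here, as are the names `cosine`, `closest_tokens`, `TOY_EMBEDDINGS`, and `TOY_FEATURE` and the toy 3-dimensional data.

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def closest_tokens(
    feature_vector: Sequence[float],
    token_embeddings: Dict[str, Sequence[float]],
    k: int = 3,
) -> List[Tuple[str, float]]:
    """Return the k tokens whose embeddings are most similar to the feature vector.

    Only static embeddings are consulted, so the result is the same
    regardless of the prompt in which the feature fires.
    """
    scored = [(tok, cosine(feature_vector, emb)) for tok, emb in token_embeddings.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Hypothetical toy vocabulary; a real model would supply its embedding matrix.
TOY_EMBEDDINGS = {
    "cat": [0.9, 0.1, 0.0],
    "dog": [0.8, 0.2, 0.1],
    "car": [0.0, 0.9, 0.3],
    "road": [0.1, 0.8, 0.5],
}
# Hypothetical feature vector pointing along the first axis.
TOY_FEATURE = [1.0, 0.0, 0.0]

top = closest_tokens(TOY_FEATURE, TOY_EMBEDDINGS, k=2)
print([tok for tok, _ in top])  # tokens ranked: cat, then dog
```

The function takes no prompt or activation context, which illustrates the limitation noted above, and it needs only the model's own vectors, which illustrates the advantage.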

Exploring SAE features in LLMs with definition trees and token lists — LessWrong

TL;DR A software tool is presented which includes two separate methods to assist in the interpretation of SAE features. Both use a "feature vector" b…

https://www.lesswrong.com/posts/w35H4ski8cHMpnWgX/exploring-sae-features-in-llms-with-definition-trees-and
