Transformer Debugger
A component can be an attention head, an MLP neuron, or an autoencoder latent
Each node exists only in one forward/backward pass. If you modify the prompt and rerun, that creates different nodes
Residual stream: the per-token vector that accumulates the outputs of the embedding, attention, and MLP components; each component reads from it and writes its output back into it
A circuit is a set of nodes that work together to perform some behavior or reasoning
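The residual-stream idea above can be sketched in a few lines. This is an illustrative toy, not TDB code: the stream starts as a token embedding and each layer's attention and MLP outputs are added into it (all arrays here are random stand-ins).

```python
import numpy as np

rng = np.random.default_rng(0)
d_model = 8

# Residual stream for one token position, initialized to its embedding.
stream = rng.normal(size=d_model)

for layer in range(2):
    attn_out = rng.normal(size=d_model)  # stand-in for an attention head's write
    mlp_out = rng.normal(size=d_model)   # stand-in for an MLP neuron's write
    # Components add their outputs into the stream rather than replacing it.
    stream = stream + attn_out + mlp_out

print(stream.shape)
```

Because every component only adds to the stream, its individual contribution can later be isolated, which is what makes per-node analysis possible.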
TDB Usages
Terms
https://github.com/openai/transformer-debugger/blob/main/terminology.md
1. TDB Intro
In this video, I will walk you through Transformer Debugger, a tool developed to perform exploratory analyses on activations of Transformer language models. Similar to a Python debugger, Transformer Debugger allows you to step through language model outputs, trace important activations, and analyze upstream activations. I will explain how to use prompts, view model outputs, and interpret the node table.
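Tracing which upstream activation matters for an output can be approximated by a direct-effect calculation. The sketch below is a hedged illustration (not TDB's actual implementation, and all values are random stand-ins): each component's residual-stream write is projected onto the unembedding direction of a target token, giving one scalar per component.

```python
import numpy as np

rng = np.random.default_rng(0)
d_model, n_components = 16, 4

# Stand-ins: each row is one component's write into the residual stream.
writes = rng.normal(size=(n_components, d_model))
# Stand-in unembedding direction for the target token's logit.
w_unembed = rng.normal(size=d_model)

# Direct effect of each component on the target logit.
direct_effects = writes @ w_unembed
top_component = int(np.argmax(direct_effects))

print(top_component, direct_effects.round(2))
```

Ranking components by a score like this is one way a node table can surface the activations most responsible for a prediction.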
https://www.loom.com/share/721244075f12439496db5d53439d2f84?sid=8445200e-c49e-4028-8b8e-3ea8d361dec0
2. TDB neuron-viewer pages
In this video, I explain how the Transformer Debugger tool provides a prompt-centric view of important activations and model components in a Transformer model. I demonstrate how to navigate the tool and interpret the visualizations, such as the color map indicating attention strength. I also discuss the significance of specific attention heads and MLP neurons, highlighting their role in capturing patterns and making predictions.
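The attention strengths behind a color map like the one described above come from scaled dot-product attention. A minimal sketch, with random queries and keys as stand-ins and a causal mask so each token only attends to earlier positions:

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

rng = np.random.default_rng(0)
seq, d_head = 5, 8
q = rng.normal(size=(seq, d_head))  # stand-in queries
k = rng.normal(size=(seq, d_head))  # stand-in keys

scores = q @ k.T / np.sqrt(d_head)
mask = np.triu(np.ones((seq, seq), dtype=bool), k=1)
scores[mask] = -np.inf              # causal mask: no attending to the future

attn = softmax(scores)              # each row sums to 1
print(attn.shape)
```

Each entry `attn[i, j]` is how strongly token i attends to token j, which is exactly the quantity a viewer can render as color intensity.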
https://www.loom.com/share/21b601b8494b40c49b8dc7bfd1dc6829?sid=ee23c00a-9ede-4249-b9d7-c2ba15993556

Seonglae Cho