Register Token

Creator
Seonglae Cho
Created
2025 Jan 8 20:36
Edited
2025 Jun 13 17:47
The artifacts that appear in feature maps are tokens with a high L2 norm. They occur in low-information background regions of the image, and the model repurposes them for internal computation. These tokens carry global rather than local information, which hurts the model's interpretability and its dense prediction performance.
Therefore, adding register tokens to the input sequence gives the model a separate computational space for this purpose, which improves performance on downstream tasks such as object discovery and dense prediction.
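
As a rough illustration of the idea above, the following minimal sketch shows how register tokens can be added to the input sequence and discarded at the output. The function names, the token order ([CLS], then registers, then patches), and the identity stand-in for the transformer blocks are assumptions made for this example, not the paper's implementation.

```python
import random

def add_registers(cls_token, patch_tokens, register_tokens):
    """Return the sequence [CLS] + registers + patches fed to the transformer."""
    return [cls_token] + list(register_tokens) + list(patch_tokens)

def drop_registers(outputs, num_registers):
    """Split outputs into (CLS, patches); register outputs are discarded."""
    cls_out = outputs[0]
    patch_out = outputs[1 + num_registers:]
    return cls_out, patch_out

rng = random.Random(0)
dim, num_patches, num_registers = 8, 16, 4

def random_vector():
    return [rng.gauss(0.0, 1.0) for _ in range(dim)]

cls_token = random_vector()
patches = [random_vector() for _ in range(num_patches)]
# In a real model the registers would be learnable parameters shared across images.
registers = [random_vector() for _ in range(num_registers)]

sequence = add_registers(cls_token, patches, registers)
outputs = sequence  # identity stand-in for the transformer blocks (assumption)
cls_out, patch_out = drop_registers(outputs, num_registers)
print(len(sequence), len(patch_out))  # 21 16
```

Only the [CLS] and patch outputs are passed on to downstream heads, so the registers serve purely as scratch space during the forward pass.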

Vision Transformers Need Registers

Inside Transformers, token activation norms show that a small number of tokens become excessive attention sinks, which distorts attention visualizations and degrades performance. Register tokens are therefore added and trained to act as registers that store global image information, much like a global memory.
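
The norm-based diagnosis can be sketched as a simple outlier check on per-token L2 norms. The rule used here (flag any token whose norm exceeds a multiple of the median norm) and the `ratio` parameter are assumptions chosen for illustration; the paper's own cutoff may differ.

```python
import math
import statistics

def l2_norm(vec):
    """Euclidean norm of one token embedding."""
    return math.sqrt(sum(x * x for x in vec))

def find_high_norm_tokens(tokens, ratio=5.0):
    """Return indices of tokens whose L2 norm exceeds ratio * median norm."""
    norms = [l2_norm(t) for t in tokens]
    cutoff = ratio * statistics.median(norms)
    return [i for i, n in enumerate(norms) if n > cutoff]

# Toy feature map: nine tokens of norm 1, one of which is replaced by an outlier.
tokens = [[1.0, 0.0, 0.0, 0.0] for _ in range(9)]
tokens[3] = [30.0, 40.0, 0.0, 0.0]  # norm 50

outliers = find_high_norm_tokens(tokens)
print(outliers)  # [3]
```

Using the median rather than the mean keeps the cutoff stable even when the outliers themselves are very large.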

Vision Transformers Don’t Need Trained Registers

This paper identifies register neurons in the MLPs that generate the high-norm tokens. It then intervenes directly by moving their high-norm activations onto newly added tokens early in the network, which achieves a similar effect in pre-trained models without additional training (test-time registers).
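
A heavily simplified sketch of this intervention is shown below. It assumes that the register neuron indices are already known and that an extra, initially zero token has been appended as the last row. The function name and the exact rule (copy the peak activation to the extra token and zero it elsewhere) are assumptions for illustration, not the paper's verified procedure.

```python
def shift_to_register(activations, register_neurons):
    """Move register-neuron activations from image tokens to the last token.

    activations: list of per-token lists of MLP neuron activations,
                 where the last row is the added test-time register.
    register_neurons: indices of neurons assumed to create high-norm tokens.
    """
    out = [row[:] for row in activations]  # copy so the input is not modified
    reg = len(out) - 1
    for n in register_neurons:
        peak = max(row[n] for row in out[:reg])
        for row in out[:reg]:
            row[n] = 0.0      # suppress the neuron on image tokens
        out[reg][n] = peak    # let the extra token absorb the activation
    return out

acts = [
    [0.1, 9.0, 0.2],  # patch token where neuron 1 fires strongly
    [0.3, 0.5, 0.1],  # ordinary patch token
    [0.0, 0.0, 0.0],  # added test-time register token
]
shifted = shift_to_register(acts, register_neurons=[1])
print(shifted)  # [[0.1, 0.0, 0.2], [0.3, 0.0, 0.1], [0.0, 9.0, 0.0]]
```

Because the change is applied to activations at inference time, no model weights are updated in this sketch.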
