Ordered Action Tokenization
How can continuous actions be turned into discrete tokens that a model can take as input or generate? OAT represents actions as a sequence of discrete tokens instead of outputting a single continuous vector at once. Because actions are typically trajectories over time, consecutive actions are grouped into chunks, and each chunk is tokenized as a unit.
Latents are created with a Transformer encoder: action chunk → Transformer → latent register tokens, where the register tokens are slots that summarize the actions in the chunk. Each latent is converted to a codebook index using FSQ (Finite Scalar Quantization). Training with nested dropout and causal attention orders the tokens by information hierarchy, so earlier tokens carry the coarsest information and later tokens add detail. This allows the action chunk to be reconstructed even from just a prefix of the tokens.
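The quantization and ordering steps can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the OAT implementation: `LEVELS`, `fsq_quantize`, `fsq_index`, and `sample_prefix_mask` are hypothetical names, the per-dimension level counts are made up, and only odd level counts are handled to keep the rounding simple. The Transformer encoder is omitted; `z` stands in for one latent register token.

```python
import numpy as np

# Hypothetical setting: 3 latent dimensions, 5 levels each (odd counts only).
LEVELS = np.array([5, 5, 5])

def fsq_quantize(z, levels=LEVELS):
    """Bound each dimension with tanh, then round to the nearest integer level."""
    half = (levels - 1) / 2                      # 5 levels -> codes in {-2, ..., 2}
    return np.round(np.tanh(z) * half).astype(int)

def fsq_index(codes, levels=LEVELS):
    """Combine per-dimension codes into one codebook index (mixed radix)."""
    half = (levels - 1) // 2
    digits = codes + half                        # shift to {0, ..., L-1}
    basis = np.concatenate(([1], np.cumprod(levels[:-1])))
    return int((digits * basis).sum())

def sample_prefix_mask(num_tokens, rng):
    """Nested dropout: keep a random-length prefix of the register tokens."""
    k = rng.integers(1, num_tokens + 1)          # always keep at least one token
    return np.arange(num_tokens) < k

z = np.array([0.3, -1.2, 4.0])                   # stand-in for one latent register token
token_id = fsq_index(fsq_quantize(z))
mask = sample_prefix_mask(8, np.random.default_rng(0))
```

With these assumed levels the implicit codebook has 5 × 5 × 5 = 125 entries and no learned embedding table. Masking everything after a random prefix during training is what pushes the most important information into the earliest tokens.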
This is analogous to progressive JPEG encoding:
- First token = blurry image (coarse reconstruction)
- Adding tokens = progressively sharper reconstruction
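The coarse-to-fine behaviour in the JPEG analogy can be illustrated with a toy example. The sketch below does not use OAT's learned tokenizer; as a stand-in it uses a fixed orthonormal cosine (DCT-II) basis, the same family of transform JPEG relies on, and reconstructs a hypothetical 1-D action chunk from growing prefixes of its coefficients. The names `dct_basis` and `prefix_errors` and the sine-shaped chunk are assumptions made for illustration.

```python
import numpy as np

def dct_basis(n):
    """Orthonormal DCT-II basis; row k is the k-th cosine component."""
    k = np.arange(n)[:, None]
    t = np.arange(n)[None, :]
    basis = np.sqrt(2.0 / n) * np.cos(np.pi * (t + 0.5) * k / n)
    basis[0] /= np.sqrt(2.0)                     # rescale the constant row to unit norm
    return basis

def prefix_errors(chunk):
    """Reconstruction error when only the first k coefficients are kept."""
    n = len(chunk)
    basis = dct_basis(n)
    coeffs = basis @ chunk                       # ordered from coarse to fine
    errors = []
    for k in range(1, n + 1):
        recon = basis[:k].T @ coeffs[:k]         # decode from the prefix only
        errors.append(float(np.linalg.norm(chunk - recon)))
    return errors

chunk = np.sin(np.linspace(0.0, np.pi, 16))      # hypothetical 1-D action chunk
errors = prefix_errors(chunk)
```

Each additional coefficient can only lower the error, and the full set reproduces the chunk. OAT aims for the same property with learned tokens, which is why a prefix is already a usable, if coarse, action.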
Ordered Action Tokenization
https://ordered-action-tokenization.github.io/
OAT: Ordered Action Tokenization
Autoregressive policies offer a compelling foundation for scalable robot learning by enabling discrete abstraction, token-level reasoning, and flexible inference. However, applying autoregressive...
https://arxiv.org/abs/2602.04215

Chaoqi Liu on Twitter / X
Chaoqi Liu (@liu730chaoqi), February 7, 2026
https://x.com/liu730chaoqi/status/2020268786046390357?s=20

Seonglae Cho