OAT

Creator
Seonglae Cho
Created
2026 Feb 24 18:23
Edited
2026 Feb 24 18:27

Ordered Action Tokenization

How can continuous actions be tokenized into discrete tokens for input or generation? OAT represents actions as a sequence of discrete tokens instead of outputting a single vector at once. Because actions are typically trajectories over time, consecutive actions are grouped into chunks, and each chunk is tokenized as a unit.
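A minimal sketch of the chunking step, assuming non-overlapping chunks of a fixed horizon. The function name `chunk_actions`, the `horizon` parameter, and the padding rule (repeat the last action) are illustrative assumptions, not details taken from OAT.

```python
from typing import List, Sequence

Action = Sequence[float]  # one continuous action vector, e.g. joint targets


def chunk_actions(trajectory: Sequence[Action], horizon: int) -> List[List[Action]]:
    """Split a trajectory into consecutive, non-overlapping chunks.

    Assumption: a short trailing chunk is padded by repeating its last
    action so that every chunk has exactly `horizon` steps.
    """
    if horizon <= 0:
        raise ValueError("horizon must be positive")
    chunks: List[List[Action]] = []
    for start in range(0, len(trajectory), horizon):
        chunk = list(trajectory[start:start + horizon])
        while len(chunk) < horizon:
            chunk.append(chunk[-1])  # pad with the final action
        chunks.append(chunk)
    return chunks


if __name__ == "__main__":
    # Five 2-D actions, chunked with a hypothetical horizon of 2.
    demo = [[0.0, 0.1], [0.2, 0.3], [0.4, 0.5], [0.6, 0.7], [0.8, 0.9]]
    for chunk in chunk_actions(demo, horizon=2):
        print(chunk)
```

Each resulting chunk is what the tokenizer would then map to a short sequence of discrete tokens.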
Latents are created with a Transformer encoder: action chunk → Transformer → latent register tokens, where the register tokens are slots that summarize the actions. Each latent is converted to a codebook index using FSQ (Finite Scalar Quantization). Nested dropout combined with causal attention orders the tokens by information hierarchy, so earlier tokens carry the coarsest information. This allows the actions to be reconstructed even from just a prefix of the tokens.
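The three mechanisms named above can be sketched in isolation. This is a simplified illustration under stated assumptions, not OAT's implementation: the FSQ variant bounds each latent dimension with tanh and rounds it to one of a fixed number of levels, the level counts are hypothetical, the nested-dropout prefix length is sampled uniformly, and the causal mask is the standard lower-triangular one (how OAT applies it across action and register tokens is not specified here).

```python
import math
import random
from typing import List, Sequence


def fsq_index(latent: Sequence[float], levels: Sequence[int]) -> int:
    """Quantize one latent vector to a single codebook index (simplified FSQ).

    Each dimension is squashed into (0, 1) with tanh, rounded to one of
    `levels[i]` integer digits, and the digits are combined as a
    mixed-radix number. The implied codebook size is prod(levels).
    """
    if len(latent) != len(levels):
        raise ValueError("latent and levels must have the same length")
    index, base = 0, 1
    for z, n_levels in zip(latent, levels):
        bounded = (math.tanh(z) + 1.0) / 2.0          # in (0, 1)
        digit = int(round(bounded * (n_levels - 1)))  # in 0 .. n_levels - 1
        index += digit * base
        base *= n_levels
    return index


def nested_dropout_keep(num_tokens: int, rng: random.Random) -> List[bool]:
    """Sample a prefix length k and keep only the first k tokens.

    Assumption: k is uniform on 1 .. num_tokens. Because the tail is
    always what gets dropped, training pushes information to the front.
    """
    k = rng.randint(1, num_tokens)
    return [i < k for i in range(num_tokens)]


def causal_mask(num_tokens: int) -> List[List[bool]]:
    """mask[i][j] is True when token i may attend to token j (j <= i)."""
    return [[j <= i for j in range(num_tokens)] for i in range(num_tokens)]


if __name__ == "__main__":
    print(fsq_index([0.0, 0.0], levels=[5, 5]))
    print(nested_dropout_keep(4, random.Random(0)))
    print(causal_mask(3))
```

During training, the decoder would see only the kept prefix, which is what makes reconstruction from a prefix possible at inference time.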

This works like progressive JPEG encoding:

  • First token = blurry image (a coarse reconstruction of the action chunk)
  • Adding tokens = progressively sharper
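The coarse-to-fine behaviour in this analogy can be illustrated with a toy stand-in. The code below is not OAT's learned tokenizer: it swaps in plain interval bisection of a single scalar in [-1, 1], chosen only because it also yields an ordered token sequence where any prefix decodes to a valid, coarser estimate. All names are hypothetical.

```python
from typing import List, Sequence


def encode(x: float, num_tokens: int, lo: float = -1.0, hi: float = 1.0) -> List[int]:
    """Encode x as ordered binary tokens by repeatedly halving [lo, hi]."""
    tokens: List[int] = []
    for _ in range(num_tokens):
        mid = (lo + hi) / 2.0
        if x >= mid:
            tokens.append(1)  # x lies in the upper half
            lo = mid
        else:
            tokens.append(0)  # x lies in the lower half
            hi = mid
    return tokens


def decode_prefix(tokens: Sequence[int], lo: float = -1.0, hi: float = 1.0) -> float:
    """Decode any prefix of the tokens to the midpoint of the remaining interval."""
    for token in tokens:
        mid = (lo + hi) / 2.0
        if token == 1:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


if __name__ == "__main__":
    x = 0.3
    tokens = encode(x, num_tokens=6)
    for k in range(len(tokens) + 1):
        estimate = decode_prefix(tokens[:k])
        # With k tokens the error is at most 2 ** -k.
        print(k, estimate, abs(estimate - x))
```

The first token alone fixes only the sign of the value (the "blurry image"), and each additional token halves the worst-case error, which mirrors how a longer prefix of ordered action tokens would give a sharper reconstruction.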
 
OAT: Ordered Action Tokenization
Autoregressive policies offer a compelling foundation for scalable robot learning by enabling discrete abstraction, token-level reasoning, and flexible inference. However, applying autoregressive...
Chaoqi Liu on Twitter / X
https://t.co/FBGI9OsILn— Chaoqi Liu (@liu730chaoqi) February 7, 2026
 
 
