Think before you speak: PAUSE (filler) tokens

PAUSE tokens can be strategically inserted not only during inference but also at appropriate positions within the context, much as humans pause to think before responding.
These tokens enrich the context representation by giving the attention layers extra positions to compute over, so the model is no longer limited to a fixed amount of computation per emitted token. This works in the opposite direction from MoD (Mixture-of-Depths), which saves compute by routing tokens around layers rather than adding it.
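
As a concrete illustration, the sketch below appends pause tokens to a prompt before decoding. It is a minimal sketch assuming a PyTorch setup; `PAUSE_ID` and `insert_pauses` are hypothetical names, and a real setup would reserve a dedicated `<pause>` token in the tokenizer and train the model to use it.

```python
import torch

# Hypothetical id for the <pause> token; a real setup would add it to the
# tokenizer's vocabulary and resize the model's embeddings accordingly.
PAUSE_ID = 50257  # e.g. one slot past GPT-2's 50257-token vocab (illustrative)

def insert_pauses(input_ids: torch.Tensor, num_pauses: int) -> torch.Tensor:
    """Append <pause> tokens after the prompt, buying the model
    `num_pauses` extra attention positions of computation before it
    must commit to its first answer token."""
    pauses = torch.full(
        (input_ids.size(0), num_pauses), PAUSE_ID, dtype=input_ids.dtype
    )
    return torch.cat([input_ids, pauses], dim=1)

prompt = torch.tensor([[464, 3280, 318]])  # arbitrary example token ids
extended = insert_pauses(prompt, num_pauses=4)
print(extended.shape)  # torch.Size([1, 7]); decoding starts after the pause
                       # run, and outputs at the pause positions are ignored
```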
Potential Extensions:
- Mimicking human behavior: Rather than having pause tokens operate only autoregressively, processing them bi-directionally (without padding) could yield improved results; one possible attention mask for this is sketched after this list.
- Balancing deliberation and fluency: Just as humans sometimes speak in a stream of consciousness and at other times deliberate carefully, models could benefit from supporting both modes of processing.
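
One way to read the bi-directional idea in the first bullet is to relax the causal mask only among the pause tokens themselves, so they can exchange information in both directions while the rest of the sequence stays autoregressive. The sketch below is one hypothetical construction of such a mask, not a scheme from the paper; `pause_aware_mask` and the pauses-attend-only-to-other-pauses choice are assumptions.

```python
import torch

def pause_aware_mask(seq_len: int, pause_positions: list) -> torch.Tensor:
    """Standard causal (lower-triangular) mask, except positions holding
    pause tokens may also attend forward to the other pause positions."""
    mask = torch.tril(torch.ones(seq_len, seq_len, dtype=torch.bool))
    for i in pause_positions:
        for j in pause_positions:
            mask[i, j] = True  # pauses see each other in both directions
    return mask

# Positions 2 and 3 hold pause tokens in a length-6 sequence.
print(pause_aware_mask(6, pause_positions=[2, 3]).int())
```

Whether pause positions should also see future non-pause tokens, and how such a mask interacts with KV caching at inference time, remain open design choices.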