RoPE++

Creator
Seonglae Cho
Created
2026 Mar 6 15:57
Editor
Edited
2026 Mar 6 15:58
Refs
This paper starts from the observation that standard RoPE uses only the real part of the complex inner product when computing attention and discards the imaginary part. RoPE++ reuses that discarded imaginary component to better capture long-context information.
The imaginary part is used alongside the real part, much like adding a separate set of attention heads, which preserves more positional information. Real-part attention is strong at capturing local and semantic locality, while imaginary-part attention tends to focus on more distant tokens, better capturing long-range dependencies. Using both together improves performance on long contexts.
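The split between the two attention signals can be made concrete with a small NumPy sketch (my illustration, not code from the paper): pairs of feature dimensions are viewed as complex numbers and rotated by position, and the complex inner product between rotated queries and keys is split into its real part (the standard RoPE logits) and its imaginary part (the component RoPE++ reuses).

```python
import numpy as np

def rope_complex(x, pos, theta_base=10000.0):
    """View adjacent feature pairs as complex numbers and rotate by position."""
    half = x.shape[-1] // 2
    freqs = theta_base ** (-np.arange(half) / half)   # per-pair frequencies
    angles = pos[:, None] * freqs[None, :]            # (seq, half)
    z = x[..., 0::2] + 1j * x[..., 1::2]              # complexify pairs
    return z * np.exp(1j * angles)                    # position-dependent rotation

seq, d = 6, 8
rng = np.random.default_rng(0)
q = rng.normal(size=(seq, d))
k = rng.normal(size=(seq, d))
pos = np.arange(seq)

qz = rope_complex(q, pos)
kz = rope_complex(k, pos)
scores = qz @ kz.conj().T       # complex inner product; phase depends on m - n

real_scores = scores.real       # standard RoPE attention logits
imag_scores = scores.imag       # component standard RoPE discards
```

Because the rotation phases cancel as e^{i(m-n)θ}, both score maps depend only on relative position; standard RoPE keeps `real_scores` and drops `imag_scores`, whereas the paper's approach feeds both into attention.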
