An implicit CoT framework (CODI) that jointly trains an explicit natural-language CoT task (teacher) and an implicit CoT task (student) within the same model. It aligns the hidden state of the token immediately preceding answer generation (e.g., the last token of "The answer is:") between the two tasks with an L1 self-distillation loss, compressing the explicit CoT into a small number of continuous latent thoughts that replace it.
https://arxiv.org/pdf/2502.21074
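
A minimal PyTorch sketch of this objective, assuming a GPT-2-style decoder-only LM. The probe string, the three latent thoughts, the last-layer-only alignment, and the loss masking are simplifications for illustration (the paper distills hidden states across layers and masks the teacher CE loss to the CoT and answer spans):

```python
# CODI-style self-distillation sketch (hypothetical simplification of
# arXiv:2502.21074): one shared model plays teacher (explicit CoT) and
# student (continuous latent thoughts); an L1 loss aligns their hidden
# states at the token just before the answer is generated.
import torch
import torch.nn.functional as F
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tok = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")   # shared teacher/student model

question = "Q: 2 + 3 * 4 = ?"
cot      = " 3 * 4 = 12, then 2 + 12 = 14."
probe    = " The answer is:"                      # aligned at its last token
answer   = " 14"
n_ans    = tok(answer, return_tensors="pt").input_ids.shape[1]

# ---- Teacher pass: question + explicit CoT + probe + answer --------------
t_ids = tok(question + cot + probe + answer, return_tensors="pt").input_ids
t_out = model(t_ids, labels=t_ids, output_hidden_states=True)  # CE over full seq (paper masks it)
t_pos = t_ids.shape[1] - n_ans - 1                             # last probe token
t_hidden = t_out.hidden_states[-1][0, t_pos]                   # last layer only (paper uses all layers)

# ---- Student pass: question + latent thoughts + probe + answer -----------
emb = model.get_input_embeddings()
seq = emb(tok(question, return_tensors="pt").input_ids)
for _ in range(3):  # 3 latent thoughts; re-encodes each step for clarity
    h = model(inputs_embeds=seq, output_hidden_states=True).hidden_states[-1][:, -1:]
    seq = torch.cat([seq, h], dim=1)              # feed hidden state back as next input

pa_ids = tok(probe + answer, return_tensors="pt").input_ids
s_emb = torch.cat([seq, emb(pa_ids)], dim=1)
s_labels = torch.full(s_emb.shape[:2], -100)      # supervise only probe + answer tokens
s_labels[0, -pa_ids.shape[1]:] = pa_ids[0]
s_out = model(inputs_embeds=s_emb, labels=s_labels, output_hidden_states=True)
s_hidden = s_out.hidden_states[-1][0, s_emb.shape[1] - n_ans - 1]

# ---- L1 self-distillation: pull student's probe state to teacher's -------
distill = F.l1_loss(s_hidden, t_hidden.detach())  # teacher detached: one-way flow
loss = t_out.loss + s_out.loss + distill
loss.backward()
```

Detaching the teacher's hidden state makes the distillation one-directional: the latent thoughts are pushed to reproduce the reasoning state that the explicit CoT would have produced at the answer position, which is what lets the latents replace the CoT at inference time.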

Seonglae Cho