Pretraining Dataset MethodsPretraining Dataset Filtering LLMs encode not only "what" they learn, but also "when" they learn it in their internal representations in a linear fashion.arxiv.orghttps://arxiv.org/pdf/2509.14223