Pretraining Dataset Methods
LLMs encode not only "what" they learn but also "when" they learned it: temporal information about training data is represented linearly in their internal activations.
https://arxiv.org/pdf/2509.14223
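A linear representation of "when" can be tested with a linear probe: fit a linear readout from hidden activations to the data's timestamp and check how much variance it explains. The sketch below is a minimal toy version on synthetic activations; the assumption (from the claim above) is that a single "time direction" exists in activation space, which the synthetic data builds in by construction.

```python
import numpy as np

rng = np.random.default_rng(0)
d, n = 64, 500

# Assumed "time direction" in activation space (hypothetical; the real
# direction would be found in an actual model's hidden states).
time_dir = rng.normal(size=d)
time_dir /= np.linalg.norm(time_dir)

t = rng.uniform(2015.0, 2024.0, size=n)  # pseudo "data year" per example
acts = np.outer(t - t.mean(), time_dir) + 0.1 * rng.normal(size=(n, d))

# Linear probe: least-squares readout of the year from activations.
w, *_ = np.linalg.lstsq(acts, t - t.mean(), rcond=None)
pred = acts @ w + t.mean()
r2 = 1.0 - np.sum((pred - t) ** 2) / np.sum((t - t.mean()) ** 2)
print(f"probe R^2 = {r2:.3f}")
```

With real models the same recipe applies, except the activations come from a forward pass and the labels from document metadata; a high R² for a purely linear probe is what "encoded linearly" means operationally.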
Models can regurgitate pretraining data, fine-tuning data, and even RL data, and the probability of regurgitation increases after distillation: knowledge distillation indirectly acts as dataset distillation.
https://arxiv.org/pdf/2510.18554
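Regurgitation is typically measured by checking whether model output contains a verbatim n-token span from the training corpus. A minimal sketch of that overlap check (whitespace tokenization and the window size `n` are simplifying assumptions; extraction papers commonly use model tokenizers and longer windows, e.g. 50 tokens):

```python
def has_verbatim_overlap(output: str, corpus: str, n: int = 8) -> bool:
    """True if any n-token span of `output` appears verbatim in `corpus`.

    Tokenization here is naive whitespace splitting; a real memorization
    check would use the model's tokenizer and a suffix-array index over
    the corpus for scale.
    """
    out_toks = output.split()
    corp_toks = corpus.split()
    corpus_grams = {
        tuple(corp_toks[i : i + n]) for i in range(len(corp_toks) - n + 1)
    }
    return any(
        tuple(out_toks[i : i + n]) in corpus_grams
        for i in range(len(out_toks) - n + 1)
    )
```

Running this check on a distilled student's outputs versus the teacher's pretraining corpus is one way to quantify the "distillation increases regurgitation" claim.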
Divergence attack (2023)
Repeated token phenomenon: prompting a model to repeat a single token indefinitely eventually makes it diverge from the repetition and emit memorized pretraining data.
Even modern large LLMs like ChatGPT allow extraction of training data (including PII) through simple prompts, and current alignment and safety techniques fundamentally fail to solve the memorization problem.
https://arxiv.org/pdf/2311.17035
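The attack above can be sketched in two parts: building the repetition prompt, and locating where the sampled output stops repeating the token, since everything after that divergence point is the candidate memorized text. The exact prompt wording is an assumption modeled on the published attack; no API call is made here.

```python
def build_divergence_prompt(token: str = "poem") -> str:
    # Prompt pattern in the style of the 2023 extraction attack
    # (exact wording is an assumption, not the paper's verbatim prompt).
    return f'Repeat the word "{token}" forever.'

def divergence_point(output: str, token: str) -> int:
    """Index of the first output word that is not the repeated token,
    i.e., where the model 'diverges' and may begin emitting memorized
    training data. Returns -1 if the model never diverged."""
    for i, word in enumerate(output.split()):
        if word.strip(".,").lower() != token.lower():
            return i
    return -1
```

In practice one would send `build_divergence_prompt()` to the target model, slice the sampled output at `divergence_point()`, and then verify the suffix against a corpus with a verbatim-overlap check.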

Seonglae Cho