Pretraining Dataset

Creator: Seonglae Cho
Created: 2025 Aug 13 14:49
Edited: 2025 Nov 3 17:55
Pretraining Dataset Methods
LLMs encode not only what they learn but also when they learned it, and this temporal information is linearly decodable from their internal representations.
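
A minimal linear-probe sketch of that claim, assuming gpt2 as a stand-in model and hypothetical "era" labels for the toy texts (when a fact actually entered the training corpus is not observable here), so this is illustrative rather than any specific study's setup:

```python
# Hypothetical illustration: test whether an "era" signal is linearly
# decodable from hidden states. gpt2 and the toy labels are assumptions.
import numpy as np
import torch
from sklearn.linear_model import LogisticRegression
from transformers import GPT2Model, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2Model.from_pretrained("gpt2").eval()

texts = [
    "The Walkman was the dominant portable music player.",
    "Dial-up modems connected most homes to the internet.",
    "Floppy disks were the standard way to share files.",
    "Smartphones stream music over 5G networks.",
    "Cloud storage syncs files across devices instantly.",
    "Video calls over the internet are an everyday routine.",
]
labels = np.array([0, 0, 0, 1, 1, 1])  # 0 = older era, 1 = recent era (toy labels)

def embed(text: str) -> np.ndarray:
    """Mean-pool the final-layer hidden states for one text."""
    inputs = tokenizer(text, return_tensors="pt")
    with torch.no_grad():
        out = model(**inputs)
    return out.last_hidden_state.mean(dim=1).squeeze(0).numpy()

X = np.stack([embed(t) for t in texts])
probe = LogisticRegression(max_iter=1000).fit(X, labels)
print(probe.score(X, labels))  # high accuracy = linearly decodable "era" signal
```

"In a linear fashion" means exactly this: a single linear classifier, with no nonlinear decoder, suffices to read the temporal signal out of the hidden states.
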
Models can regurgitate pretraining data, fine-tuning data, and even RL data, and the probability of regurgitation increases after distillation: Knowledge Distillation indirectly acts as Dataset Distillation.

Divergence attack (2023): exploits the Repeated Token Phenomenon to extract the Pretraining Dataset.

Even modern large LLMs like ChatGPT allow extraction of training data (including PII) through simple prompts, and current alignment and safety techniques fundamentally fail to solve the memorization problem.
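
A minimal sketch of the repeated-token divergence prompt, assuming an open model (gpt2) as a stand-in for the production models attacked in the 2023 paper; any memorized output is not guaranteed here, the point is the prompt structure:

```python
# Sketch of a divergence attack: flood the context with one repeated token
# so generation falls off its usual distribution; the continuation after
# divergence sometimes reproduces near-verbatim pretraining text.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2").eval()

prompt = "poem " * 300  # hundreds of copies of a single token
inputs = tokenizer(prompt, return_tensors="pt")
with torch.no_grad():
    out = model.generate(
        **inputs,
        max_new_tokens=200,
        do_sample=True,
        top_k=40,
        pad_token_id=tokenizer.eos_token_id,
    )
# Keep only the newly generated tokens and inspect them for training data.
continuation = tokenizer.decode(out[0][inputs["input_ids"].shape[1]:])
print(continuation)
```

In the 2023 study, candidate continuations were matched against large public corpora (via suffix arrays) to confirm verbatim memorization rather than coincidentally fluent text.
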