AI Memorization

Creator: Seonglae Cho
Created: 2025 Nov 13 0:40
Edited: 2025 Nov 13 0:40

Understanding Memorization via Loss Curvature

While it is known that models memorize parts of their training data, it was unclear whether memorization and general reasoning are stored along structurally different weight directions within the model. This research uses loss curvature to separate the weight directions associated with memorization from those associated with general computation and reasoning, and checks whether reasoning ability is preserved when only the memorization directions are removed. Unlike BalancedSubnet, there is no need to specify which data to erase: the method identifies the structural signature of where the model stores memorization overall. Directions used for reasoning are kept; only memorization directions are removed.

K-FAC for curvature approximation

For a single-sample loss, a memorized sample sits at a sharp minimum: the loss around it has very high curvature, and predictions break under slight weight changes. The method, however, measures curvature of the dataset-wide loss. Low-curvature directions are barely used across the dataset and relate only to a few specific samples (i.e., memorized samples), while moderate-curvature directions are structures shared across many samples and support general abilities such as reasoning, language understanding, and attention.
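The gap between per-sample and dataset-wide curvature can be illustrated with a toy linear model. This is a hypothetical setup, not the paper's experiment: for a least-squares loss the Hessian is (2/n)·XᵀX, so a weight direction exercised by only one sample ends up with far lower dataset-wide curvature than directions shared across all samples.

```python
import numpy as np

rng = np.random.default_rng(0)

# Dataset-wide loss for a linear model: L(w) = mean_i (x_i . w - y_i)^2
# Its Hessian is (2/n) X^T X, so curvature along direction v is (2/n) ||X v||^2.
n, d = 100, 3
X = np.zeros((n, d))
X[:, :2] = rng.normal(size=(n, 2))  # features 0 and 1 are used by every sample
X[0, 2] = 2.0                       # feature 2 is used by a single "memorized" sample

H = 2.0 / n * X.T @ X               # dataset-wide loss Hessian
eigvals, eigvecs = np.linalg.eigh(H)

# Smallest eigenvalue ~ the lone-sample direction; the shared directions sit near 2.
print(eigvals)
```

The single-sample direction is sharp from that sample's own point of view, yet nearly flat in the dataset-wide loss, which is exactly the signal used to flag memorization directions.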
Pure memorization: performance collapses to the 3–16% level. Logical reasoning: almost fully preserved, or slightly improved. Tasks fall on a spectrum from memorization to reasoning: (Arithmetic / Factual Recall) — QA — Logical Reasoning. Mathematics: the reasoning process remains intact but calculation mistakes appear, suggesting that accurate arithmetic relies heavily on memorization-based structures.
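The editing step behind these results can be sketched with K-FAC in NumPy. This is a minimal illustration under stated assumptions, not the paper's implementation: K-FAC approximates a layer's curvature as a Kronecker product of the input-activation covariance A and the output-gradient covariance G, so each weight direction in the joint eigenbasis has curvature λ_G[i]·λ_A[j]; the sketch zeroes the lowest-curvature components of W. The layer sizes and the 50% removal fraction are arbitrary choices for the demo.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy layer W: d_in -> d_out, with activations/gradients collected over a dataset.
d_in, d_out, n = 8, 6, 500
acts = rng.normal(size=(n, d_in))    # input activations a_i
grads = rng.normal(size=(n, d_out))  # output gradients g_i
W = rng.normal(size=(d_out, d_in))

# K-FAC: approximate the layer's curvature (Fisher/GGN block) as A kron G.
A = acts.T @ acts / n                # input covariance, (d_in, d_in)
G = grads.T @ grads / n              # gradient covariance, (d_out, d_out)
eva, Ua = np.linalg.eigh(A)          # eigenbasis on the input side
evg, Ug = np.linalg.eigh(G)          # eigenbasis on the gradient side

# Curvature of weight direction (i, j) in the joint eigenbasis.
curv = np.outer(evg, eva)            # shape (d_out, d_in)

# Express W in the K-FAC eigenbasis, zero the lowest-curvature components,
# then map back: low-curvature (memorization-like) directions are removed.
C = Ug.T @ W @ Ua
k = int(0.5 * C.size)                # removal fraction (hyperparameter for the demo)
thresh = np.partition(curv.ravel(), k)[k]
C_edit = np.where(curv >= thresh, C, 0.0)
W_edit = Ug @ C_edit @ Ua.T
```

High-curvature components of W, shared across many samples, pass through unchanged, which is why broadly used abilities survive the edit.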