Presents a new method for rigorously separating the information language models "inadvertently" memorize from their training data from the information they gain through generalization, and for measuring each in bits.
- Definition of memorization: uses a Kolmogorov-complexity-style notion of description length to quantify, in bits, how much information a model θ stores about a specific sample x.
- By training on uniformly random bit sequences, where generalization is impossible, the authors measure pure "inadvertent memorization" and find a storage limit of roughly 3.6 bits per parameter.
- When the dataset's information content exceeds model capacity, the deep double descent phenomenon occurs and the model shifts from memorization toward stronger generalization. This resembles a neural network phase change and can be interpreted in the same frame as feature dimensionality and grokking.
- The success rate of loss-based membership inference follows a sigmoid as a function of the ratio between model capacity and dataset size.
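The bits-of-memorization definition can be approximated as a compression difference: the description length of x under a reference model (which can only generalize) minus its description length under the target model θ. A minimal sketch, where the per-token probabilities are made-up illustrative values, not numbers from the paper:

```python
import math

def bits(prob):
    """Shannon code length (in bits) for an event with this probability."""
    return -math.log2(prob)

# Hypothetical per-token probabilities assigned to a sample x by a
# reference model (generalization only, never trained on x) and by the
# target model theta (trained on x). All values are illustrative.
p_ref   = [0.20, 0.10, 0.25, 0.05]
p_theta = [0.90, 0.85, 0.95, 0.80]

# Description length of x under each model (arithmetic-coding bound).
len_ref   = sum(bits(p) for p in p_ref)
len_theta = sum(bits(p) for p in p_theta)

# Memorized information: bits saved by encoding x with theta instead of
# the reference -- a compression-based stand-in for the paper's
# Kolmogorov-complexity definition.
memorized_bits = max(0.0, len_ref - len_theta)
print(f"{memorized_bits:.2f} bits memorized about x")
```

Information the reference model also predicts well is "generalization" and cancels out; only the extra bits θ stores about this specific sample count as memorization.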
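The sigmoid relationship in the last bullet can be sketched as a logistic curve in the capacity-to-data ratio. The functional form, the slope constant, and the helper name `mia_success` are illustrative assumptions, not the paper's fitted curve; only the 3.6 bits/parameter figure comes from the source:

```python
import math

def mia_success(num_params, dataset_bits, bits_per_param=3.6):
    """Toy model of loss-based membership-inference success rate.

    Assumed logistic form: success rises from chance (0.5) toward 1.0
    as total model capacity (params * bits/param) overtakes the
    information content of the dataset. Slope 8.0 is arbitrary.
    """
    ratio = (num_params * bits_per_param) / dataset_bits
    return 0.5 + 0.5 / (1.0 + math.exp(-8.0 * (ratio - 1.0)))

# Capacity far below dataset size: inference is near chance.
print(mia_success(num_params=1e6, dataset_bits=1e9))
# Capacity far above dataset size: inference approaches certainty.
print(mia_success(num_params=1e9, dataset_bits=1e6))
```

The qualitative point is that membership inference only works well once the model has enough spare capacity to memorize individual samples, which matches the memorization/generalization transition described above.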