SLT
Classical statistical theory covers regular models, which behave well only when the model is identifiable and the Fisher information matrix is positive definite. Deep learning models such as neural networks are singular models: the map from parameters to distributions is not one-to-one and the Fisher information degenerates, so the set of optimal parameters is not a single point but a set with singularities, and conventional inference tools (such as the Laplace approximation) break down. The prediction error bound is then expressed through the learning coefficient (RLCT) rather than the parameter dimension, but the theory still requires the i.i.d. assumption.
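As a rough reference, stated informally for i.i.d. data following Watanabe's standard asymptotics (G_n denotes the Bayes generalization error, d the number of parameters, n the sample size):

```latex
% Regular model (identifiable, positive-definite Fisher information):
\mathbb{E}[G_n] \approx \frac{d}{2n}
% Singular model: the learning coefficient (RLCT) \lambda replaces d/2, with \lambda \le d/2:
\mathbb{E}[G_n] = \frac{\lambda}{n} + o\!\left(\frac{1}{n}\right)
```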
Sumio Watanabe argues that most modern machine learning models, such as deep networks and latent variable models, have a "singular" structure: the set of parameters realizing the true distribution contains singularities, so conventional statistical theory (asymptotic normality, Fisher-information-based analysis) fails. As a result, the KL divergence cannot be approximated by a quadratic form as in the regular theory, the MLE may diverge, and the posterior need not become normal.
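A minimal sketch of what breaks: in a regular model the KL divergence K(w) from the true distribution expands as a quadratic form around the true parameter w_0, while in a singular model the Fisher information I(w_0) degenerates and the zero set of K(w) has singularities, so no such expansion exists. The toy model y = abx + noise below is an assumed illustration, not from the text:

```latex
% Regular case: quadratic expansion justifies the Gaussian/Laplace approximation
K(w) \approx \tfrac{1}{2}\,(w - w_0)^{\top} I(w_0)\,(w - w_0), \qquad I(w_0) \succ 0
% Singular toy example: y = a b x + \mathcal{N}(0, 1), true distribution has a b = 0
K(a, b) = \tfrac{1}{2}\,a^2 b^2\,\mathbb{E}[x^2]
% The zero set \{a = 0\} \cup \{b = 0\} crosses itself at the origin,
% where the Hessian of K vanishes entirely, so no quadratic form fits.
```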
Algebraic geometry is used to analyze this. Through Hironaka's resolution of singularities, complicated singularities are transformed into a simple normal crossing form and analyzed mathematically. In this process, two birational invariants emerge as the core quantities explaining learning performance:
- RLCT (Real Log Canonical Threshold)
- Singular Fluctuation
These determine generalization performance and the model evidence (marginal likelihood), among other quantities.
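Stated informally, following Watanabe's main formulas (notation is a sketch; S_n is the empirical entropy, m the multiplicity of the RLCT, T_n the Bayes training error): after a resolution of singularities w = g(u), K takes normal crossing form, the RLCT is read off from the exponents, and it controls, together with the singular fluctuation, the free energy (negative log evidence) and the training error, in addition to the generalization error lambda/n noted above.

```latex
% Normal crossing form in a local chart, with prior factor
% |\varphi(g(u))\, g'(u)| = b(u)\,|u_1^{h_1} \cdots u_d^{h_d}|:
K(g(u)) = u_1^{2k_1} \cdots u_d^{2k_d}, \qquad
\lambda = \min_j \frac{h_j + 1}{2 k_j}
% Free energy (negative log marginal likelihood / evidence):
F_n = n S_n + \lambda \log n - (m - 1) \log\log n + O_p(1)
% Bayes training error, with singular fluctuation \nu:
\mathbb{E}[T_n] = \frac{\lambda - 2\nu}{n} + o\!\left(\frac{1}{n}\right)
```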
Sumio Watanabe Homepage - Singular Learning Theory
If this is your first time encountering singular learning theory, we recommend starting here.
https://sites.google.com/view/sumiowatanabe/home/singular-learning-theory

Seonglae Cho