SLT
Classical statistical theory covers regular models, which behave well only when the model is identifiable and the Fisher information matrix is positive definite. Deep learning models such as neural networks are singular models: the map from parameters to distributions is not one-to-one and the Fisher information degenerates, so the set of optimal parameters is not a single point but a set with singularities, and conventional inference tools (such as the Laplace approximation) break down. The prediction error bound is then expressed through the learning coefficient (RLCT) rather than the parameter dimension, but the theory still requires the i.i.d. assumption.
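As a rough reference, stated informally for i.i.d. data following Watanabe's standard asymptotics (G_n denotes the Bayes generalization error, d the number of parameters, n the sample size):

```latex
% Regular model (identifiable, positive-definite Fisher information):
\mathbb{E}[G_n] \approx \frac{d}{2n}
% Singular model: the learning coefficient (RLCT) \lambda replaces d/2, with \lambda \le d/2:
\mathbb{E}[G_n] = \frac{\lambda}{n} + o\!\left(\frac{1}{n}\right)
```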
Sumio Watanabe argues that most modern machine learning models, such as deep networks and latent variable models, have a "singular" structure: the set of parameters realizing the true distribution contains singularities, so conventional statistical theory (asymptotic normality, Fisher-information-based analysis) fails. As a result, the KL divergence cannot be approximated by a quadratic form as in the regular theory, the MLE may diverge, and the posterior need not become normal.
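A minimal sketch of what breaks: in a regular model the KL divergence K(w) from the true distribution expands as a quadratic form around the true parameter w_0, while in a singular model the Fisher information I(w_0) degenerates and the zero set of K(w) has singularities, so no such expansion exists. The toy model y = abx + noise below is an assumed illustration, not from the text:

```latex
% Regular case: quadratic expansion justifies the Gaussian/Laplace approximation
K(w) \approx \tfrac{1}{2}\,(w - w_0)^{\top} I(w_0)\,(w - w_0), \qquad I(w_0) \succ 0
% Singular toy example: y = a b x + \mathcal{N}(0, 1), true distribution has a b = 0
K(a, b) = \tfrac{1}{2}\,a^2 b^2\,\mathbb{E}[x^2]
% The zero set \{a = 0\} \cup \{b = 0\} crosses itself at the origin,
% where the Hessian of K vanishes entirely, so no quadratic form fits.
```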
Algebraic geometry is used to analyze this. Through Hironaka's resolution of singularities, complicated singularities are transformed into a simple normal crossing form and analyzed mathematically. In this process, two birational invariants emerge as the core quantities explaining learning performance:
- RLCT (Real Log Canonical Threshold)
- Singular Fluctuation
These determine generalization performance and the model evidence (marginal likelihood), among other quantities.
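Stated informally, following Watanabe's main formulas (notation is a sketch; S_n is the empirical entropy, m the multiplicity of the RLCT, T_n the Bayes training error): after a resolution of singularities w = g(u), K takes normal crossing form, the RLCT is read off from the exponents, and it controls, together with the singular fluctuation, the free energy (negative log evidence) and the training error, in addition to the generalization error lambda/n noted above.

```latex
% Normal crossing form in a local chart, with prior factor
% |\varphi(g(u))\, g'(u)| = b(u)\,|u_1^{h_1} \cdots u_d^{h_d}|:
K(g(u)) = u_1^{2k_1} \cdots u_d^{2k_d}, \qquad
\lambda = \min_j \frac{h_j + 1}{2 k_j}
% Free energy (negative log marginal likelihood / evidence):
F_n = n S_n + \lambda \log n - (m - 1) \log\log n + O_p(1)
% Bayes training error, with singular fluctuation \nu:
\mathbb{E}[T_n] = \frac{\lambda - 2\nu}{n} + o\!\left(\frac{1}{n}\right)
```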
Sumio Watanabe Homepage - Singular Learning Theory
If this is your first time encountering singular learning theory, we recommend starting here.
https://sites.google.com/view/sumiowatanabe/home/singular-learning-theory

Seonglae Cho