Function approximation with a non-trivial number of hidden layers
Deep learning presents a challenge to classical statistical learning theory. Heavily over-parameterized neural networks often achieve zero training error, yet they still generalize well to unseen data. This contradicts the traditional expectation that a model flexible enough to interpolate its training data should overfit, and it renders many classical, capacity-based generalization bounds vacuous.
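To make this concrete, here is a minimal sketch (assuming scikit-learn is available; the digits dataset, layer widths, and hyperparameters are arbitrary illustrative choices, not from the text above): an over-parameterized MLP trained without explicit regularization typically interpolates the training set yet still performs far better than chance on held-out data.

```python
# Minimal sketch: an over-parameterized MLP that reaches (near-)zero training
# error but still generalizes. Dataset and hyperparameters are illustrative.
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

X, y = load_digits(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# Wide hidden layers -> many more parameters than training examples.
clf = MLPClassifier(
    hidden_layer_sizes=(512, 512),
    alpha=0.0,          # no explicit L2 penalty
    max_iter=2000,
    random_state=0,
)
clf.fit(X_train, y_train)

print("train accuracy:", clf.score(X_train, y_train))  # usually 1.0, i.e. zero training error
print("test accuracy: ", clf.score(X_test, y_test))    # typically still high, not random
```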
Sparse activation and the Superposition Hypothesis have been proposed as possible explanations for the Grokking phenomenon, in which a model first overfits its training set (originally reported on small algorithmic datasets) and only begins to generalize much later in training, with its activations becoming sparse as generalization emerges.
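As a rough, hypothetical illustration of how activation sparsity could be tracked (a generic PyTorch diagnostic sketch, not the methodology of the works alluded to above; the architecture and data are made up): register hooks on the ReLU layers and record the fraction of exactly-zero activations. Monitoring such a statistic over the course of training would show whether the network is indeed learning to activate sparsely.

```python
# Sketch of one way to quantify "sparse activation": the fraction of
# exactly-zero ReLU outputs per layer. Architecture and inputs are dummies.
import torch
import torch.nn as nn

torch.manual_seed(0)

model = nn.Sequential(
    nn.Linear(64, 256), nn.ReLU(),
    nn.Linear(256, 256), nn.ReLU(),
    nn.Linear(256, 10),
)

sparsity = {}

def make_hook(name):
    def hook(module, inputs, output):
        # Fraction of units that are exactly zero after the ReLU.
        sparsity[name] = (output == 0).float().mean().item()
    return hook

for name, module in model.named_modules():
    if isinstance(module, nn.ReLU):
        module.register_forward_hook(make_hook(name))

x = torch.randn(128, 64)  # a batch of dummy inputs
with torch.no_grad():
    model(x)

for name, frac in sparsity.items():
    print(f"layer {name}: {frac:.2%} of activations are zero")
```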
Deep Learning Notion
Deep Learning Uses
If the only thing you know is DNNs, every problem is going to look like one suited for a DNN. In some cases, a different predictor may actually be more suitable.
If the only tool you have is a hammer, it is tempting to treat everything as if it were a nail. - Abraham Maslow