Deep double descent

Creator: Seonglae Cho
Created: 2024 Sep 14 20:18
Editor: Seonglae Cho
Edited: 2024 Oct 15 0:07
Refs: Preetum Nakkiran, Bias-Variance Trade-off, Grokking, In-context learning ability

Double descent of generalization performance

  1. First descent: test error falls as model capacity grows
  2. Overfitting: test error rises, peaking near the interpolation threshold
  3. Second descent: test error falls again as capacity grows past the threshold
Deep double descent (OpenAI blog)
We show that the double descent phenomenon occurs in CNNs, ResNets, and transformers: performance first improves, then gets worse, and then improves again with increasing model size, data size, or training time. This effect is often avoided through careful regularization. While this behavior appears to be fairly universal, we don’t yet fully understand why it happens, and view further study of this phenomenon as an important research direction.
https://openai.com/index/deep-double-descent/

Deep Double Descent: Where Bigger Models and More Data Hurt (arXiv)
We show that a variety of modern deep learning tasks exhibit a "double-descent" phenomenon where, as we increase model size, performance first gets worse and then gets better. Moreover, we show...
https://arxiv.org/abs/1912.02292
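
The three phases above can be reproduced in a toy setting. The sketch below (not from the referenced papers) uses unregularized random-Fourier-feature regression solved with minimum-norm least squares; the dataset, feature counts, and scaling are illustrative assumptions. Test error typically dips, spikes near the interpolation threshold (number of features ≈ number of training samples), and descends again for wider models.

```python
# Minimal sketch of model-wise double descent with random Fourier features.
# All constants (n_train, noise level, feature scale, widths) are illustrative.
import numpy as np

rng = np.random.default_rng(0)

def make_data(n, noise=0.3):
    x = rng.uniform(-1, 1, size=(n, 1))
    y = np.sin(2 * np.pi * x[:, 0]) + noise * rng.standard_normal(n)
    return x, y

def random_features(x, W, b):
    # Random Fourier feature map cos(xW + b); W and b are fixed per model size.
    return np.cos(x @ W + b)

n_train = 40
x_train, y_train = make_data(n_train)
x_test, y_test = make_data(2000)

widths = [2, 5, 10, 20, 30, 40, 50, 80, 160, 320, 640]
for p in widths:
    errs = []
    for _ in range(10):  # average over feature draws to smooth the curve
        W = 5.0 * rng.standard_normal((1, p))
        b = rng.uniform(0, 2 * np.pi, size=p)
        Phi_tr = random_features(x_train, W, b)
        Phi_te = random_features(x_test, W, b)
        # Minimum-norm least squares: interpolates the training set once p >= n_train.
        coef, *_ = np.linalg.lstsq(Phi_tr, y_train, rcond=None)
        errs.append(np.mean((Phi_te @ coef - y_test) ** 2))
    print(f"p={p:4d}  mean test MSE={np.mean(errs):.3f}")
# Expected shape: error falls (first descent), spikes near p ≈ n_train
# (interpolation threshold), then falls again as p grows (second descent).
```

Adding ridge regularization to the least-squares step damps the peak at the interpolation threshold, consistent with the OpenAI note above that the effect is often avoided through careful regularization.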
Copyright Seonglae Cho