Due to the repeated multiplication of weights
Gradient information needs to be passed through the network in sufficient amounts: not too much (exploding gradient), not too little (vanishing gradient).
Exploding and vanishing gradients are usually caused by the non-linear components, although a deep stack of purely linear layers has the same problem.
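A minimal sketch of the repeated-multiplication effect, assuming a toy chain of scalar layers that all share one weight (the function name `input_gradient` and the choice of sigmoid as the non-linearity are illustrative assumptions, not from the notes). By the chain rule, the gradient with respect to the input is a product of one factor per layer, so it shrinks or grows geometrically with depth.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def input_gradient(weight, depth, nonlinear=False, x=0.5):
    """Return d(output)/d(input) for a chain of `depth` scalar layers.

    Hypothetical toy model: each layer computes h = weight * h (linear)
    or h = sigmoid(weight * h) (non-linear), with the same weight everywhere.
    """
    h = x
    grad = 1.0
    for _ in range(depth):
        z = weight * h
        if nonlinear:
            h = sigmoid(z)
            # sigmoid'(z) = s(z) * (1 - s(z)), which is at most 0.25
            grad *= weight * h * (1.0 - h)
        else:
            h = z
            # a linear layer contributes exactly one factor of `weight`
            grad *= weight
    return grad


if __name__ == "__main__":
    for w in (0.5, 1.0, 1.5):
        print(f"linear,  w={w}: {input_gradient(w, 50):.3e}")
    print(f"sigmoid, w=1.0: {input_gradient(1.0, 50, nonlinear=True):.3e}")
```

In the linear case the gradient is simply `weight ** depth`, so it vanishes for |w| < 1 and explodes for |w| > 1. With the sigmoid, each layer adds an extra factor of at most 0.25, so the gradient vanishes even when the weight is 1.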