Parameter initialization
- Weights: initialize to small random numbers
- Biases: initialize to zero or a small nonzero value
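
As a minimal sketch of the two rules above, the snippet below initializes one fully connected layer with NumPy. The helper name `init_layer`, the default standard deviation of 0.01, and the layer sizes are illustrative assumptions, not values taken from this note.

```python
import numpy as np

def init_layer(fan_in, fan_out, weight_std=0.01, bias_value=0.0, seed=0):
    """Hypothetical helper: small random weights, constant bias."""
    rng = np.random.default_rng(seed)
    # Weights: zero-mean Gaussian with a small standard deviation (assumed 0.01).
    W = rng.normal(loc=0.0, scale=weight_std, size=(fan_out, fan_in))
    # Biases: zero by default, or a small nonzero constant such as 0.1.
    b = np.full(fan_out, bias_value)
    return W, b

W, b = init_layer(fan_in=784, fan_out=128)
print(W.shape, b.shape, float(W.std()))
```

Random weights break the symmetry between units in the same layer, which is why only the bias can safely start at a constant.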
Ordered regime - vanishing gradients
Chaotic regime - exploding gradients
Edge of chaos - the critical boundary between the two regimes
Therefore, weight initialization should set the initial weight scale so that the network starts at the edge of chaos, the critical learning regime between the ordered and chaotic phases. There, gradients propagate stably and latent representations are both expressive and stable.
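
The three regimes can be illustrated with a small experiment: backpropagate a unit-norm gradient through a deep random tanh network and see how its norm changes with the weight scale. This is only a sketch under stated assumptions. The function name `backprop_gradient_norm`, the depth, the width, and the three scales are all illustrative. Treating a scale of about 1 as the critical point for tanh with zero bias follows mean-field analyses of random networks and is not stated in this note.

```python
import numpy as np

def backprop_gradient_norm(weight_scale, depth=50, width=256, seed=0):
    """Norm of a unit gradient after backpropagation through `depth` tanh
    layers whose weights have std weight_scale / sqrt(width) and zero bias."""
    rng = np.random.default_rng(seed)
    x = rng.standard_normal(width)
    weights, pre_activations = [], []
    # Forward pass: store weights and pre-activations for the backward pass.
    for _ in range(depth):
        W = rng.standard_normal((width, width)) * weight_scale / np.sqrt(width)
        h = W @ x
        weights.append(W)
        pre_activations.append(h)
        x = np.tanh(h)
    # Backward pass: start from a unit-norm gradient at the output.
    g = rng.standard_normal(width)
    g /= np.linalg.norm(g)
    for W, h in zip(reversed(weights), reversed(pre_activations)):
        # Chain rule through tanh (derivative 1 - tanh^2), then through W.
        g = W.T @ (g * (1.0 - np.tanh(h) ** 2))
    return float(np.linalg.norm(g))

# Assumed example scales: ordered (0.5), near-critical (1.0), chaotic (3.0).
for scale in (0.5, 1.0, 3.0):
    print(f"weight_scale={scale}: gradient norm = {backprop_gradient_norm(scale):.3e}")
```

With these settings the smallest scale should drive the gradient norm toward zero, the largest should make it grow by orders of magnitude, and the scale near 1 should keep it at a moderate size.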
Weight Initialization Usages

Seonglae Cho