Exploding gradient

Creator: Seonglae Cho
Created: 2024 Oct 21 11:35
Edited: 2025 Nov 11 22:39

Due to the repeated multiplication of weight matrices during backpropagation, an Exploding gradient occurs when the largest Eigenvalue is larger than 1, while a Vanishing Gradient happens when it is smaller than 1.
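This eigenvalue condition can be sketched with a 1-D toy model in which a single scalar weight `w` stands in for the eigenvalue (an illustrative assumption, not a full network):

```python
def gradient_norm(w: float, depth: int) -> float:
    """Backpropagated gradient magnitude after `depth` layers that each
    multiply by the same weight w (w stands in for the eigenvalue)."""
    g = 1.0
    for _ in range(depth):
        g *= abs(w)  # chain rule: one factor of w per layer
    return g

print(gradient_norm(1.1, 50))  # eigenvalue > 1: explodes (~117x)
print(gradient_norm(0.9, 50))  # eigenvalue < 1: vanishes (~0.005x)
print(gradient_norm(1.0, 50))  # eigenvalue = 1: preserved
```

Even a modest deviation from 1 compounds exponentially with depth, which is why depth amplifies the problem.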
Gradient information must pass through the network sufficiently: not too much (Exploding gradient), not too little (Vanishing Gradient).
Exploding gradient and Vanishing Gradient typically occur due to non-linear components, though deep stacks of linear transformations can also be problematic.
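As a concrete instance of the non-linear case, the sigmoid derivative never exceeds 0.25, so backpropagating through many sigmoid layers shrinks the gradient geometrically (a minimal sketch, assuming saturating sigmoid activations):

```python
import math

def sigmoid_grad(x: float) -> float:
    """Derivative of the sigmoid: s(x) * (1 - s(x)), at most 0.25."""
    s = 1.0 / (1.0 + math.exp(-x))
    return s * (1.0 - s)

print(sigmoid_grad(0.0))  # 0.25, the maximum (at x = 0)
print(sigmoid_grad(5.0))  # near 0 once the unit saturates
print(0.25 ** 20)         # best-case gradient factor across 20 sigmoid layers
```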

Ordered - Vanishing Gradient
Chaotic - Exploding gradient
Edge of Chaos - the critical boundary between the two regimes

Therefore, when performing Weight Initialization, choosing a scale that keeps the eigenvalue spectrum near 1 ensures that gradients are stably propagated, latent representations are both expressive and stable, and the network reaches the critical learning regime (edge of chaos).
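A minimal pure-Python sketch of this idea, assuming a Xavier/Glorot-style scale (variance 1/fan_in) for a deep linear stack; the output magnitude then stays O(1) instead of exploding or vanishing:

```python
import math
import random

def init_layer(fan_in: int, fan_out: int, rng: random.Random):
    """Xavier/Glorot-style init: std = 1/sqrt(fan_in) keeps the
    activation (and gradient) scale roughly constant per layer."""
    std = 1.0 / math.sqrt(fan_in)
    return [[rng.gauss(0.0, std) for _ in range(fan_in)]
            for _ in range(fan_out)]

def forward(x, layers):
    """Apply each weight matrix in turn (no non-linearity, for clarity)."""
    for W in layers:
        x = [sum(w * v for w, v in zip(row, x)) for row in W]
    return x

rng = random.Random(0)
n, depth = 128, 20
layers = [init_layer(n, n, rng) for _ in range(depth)]
x = [rng.gauss(0.0, 1.0) for _ in range(n)]
y = forward(x, layers)
rms = math.sqrt(sum(v * v for v in y) / n)
print(rms)  # stays O(1) across 20 layers
```

With a naive scale (e.g. a fixed std of 0.5 regardless of fan_in), the same 20-layer stack would blow up by many orders of magnitude.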
 
 

Trace anomaly

 
 
 
 
