Vanishing gradients, over-smoothing, and over-squashing in GNNs all stem from the same cause, as the paper linked below argues theoretically and demonstrates experimentally. GNNs suffer from far more severe gradient vanishing than RNNs because the normalized adjacency matrix is spectrally contractive. This gradient vanishing is the root cause of over-smoothing: it produces a "fixed-point convergence" in which all node representations converge to zero.
arxiv.org
https://arxiv.org/pdf/2502.10818
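
The contraction described above can be illustrated with a small NumPy sketch. This is not the paper's experiment; it is a toy setup under stated assumptions: a 6-node path graph, purely linear layers of the form `H <- A_norm @ H @ W` with no nonlinearity, and a weight matrix fixed at `0.9 * I` so that its spectral norm is below 1. The names `normalized_adjacency` and `propagate` are hypothetical helpers made up for this illustration.

```python
import numpy as np


def normalized_adjacency(adj):
    """Symmetric normalization with self-loops: D^{-1/2} (A + I) D^{-1/2}."""
    a_hat = adj + np.eye(adj.shape[0])
    deg = a_hat.sum(axis=1)
    d_inv_sqrt = 1.0 / np.sqrt(deg)
    return a_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]


def propagate(a_norm, h, weight, num_layers):
    """Apply H <- A_norm @ H @ W repeatedly and record the Frobenius norm."""
    norms = [np.linalg.norm(h)]
    for _ in range(num_layers):
        h = a_norm @ h @ weight
        norms.append(np.linalg.norm(h))
    return h, norms


# Assumption: a toy path graph on 6 nodes (0-1-2-3-4-5).
n = 6
adj = np.zeros((n, n))
for i in range(n - 1):
    adj[i, i + 1] = adj[i + 1, i] = 1.0

a_norm = normalized_adjacency(adj)

# The eigenvalues of the normalized adjacency lie in [-1, 1],
# so multiplying by it never increases the norm of the features.
eigvals = np.linalg.eigvalsh(a_norm)

# Assumption: random initial features and a weight with spectral norm 0.9.
rng = np.random.default_rng(0)
h0 = rng.standard_normal((n, 4))
weight = 0.9 * np.eye(4)

h_final, norms = propagate(a_norm, h0, weight, num_layers=50)

print("largest |eigenvalue| of A_norm:", np.abs(eigvals).max())
print("feature norm at depth 0, 10, 50:", norms[0], norms[10], norms[50])
```

Because every layer here is linear, the Jacobian of the output with respect to the input is the same operator applied 50 times, so the gradient norm is bounded by the same factor (at most `0.9 ** 50` in this setup) as the feature norm. The node features and the gradients shrink toward zero together, which is the fixed-point behavior the note describes.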

Seonglae Cho