Compilers leverage the knowledge of computational graph to fuse operators mainly to eliminate memory loading operations. A common example is fusing batchnorm into the preceding convolution layer. Batch Normalization can be implemented as a matrix-vector operation, which can then be implemented as a 1×1 convolution.
github.com
https://github.com/kimbochen/md-blogs/tree/main/graph-compilers#operator-fusion

Seonglae Cho