Do not count the input layer as a perceptron (it has no weights or bias).

Neural Network Layer Notion

Layer Normalization
Batch Normalization
Fully Connected Layer
Residual Connection
Gradient Normalization
Query Key Normalization
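
The convention of not counting the input layer can be illustrated with a minimal Python sketch. The helper `init_mlp` and the sizes `[4, 8, 3]` are hypothetical, chosen only for illustration. The point is that a network described by three sizes has only two parameterized layers, because the input layer just passes values through and owns no weights or bias.

```python
import random


def init_mlp(layer_sizes, seed=0):
    """Build one (weights, bias) pair per layer that follows the input.

    Hypothetical helper for illustration: layer_sizes lists the width of the
    input, then each subsequent layer.
    """
    rng = random.Random(seed)
    params = []
    # Pair each layer width with the width of the layer that feeds it.
    for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:]):
        weights = [[rng.gauss(0.0, 0.1) for _ in range(n_out)] for _ in range(n_in)]
        bias = [0.0] * n_out
        params.append((weights, bias))
    return params


layer_sizes = [4, 8, 3]  # input width, hidden width, output width
params = init_mlp(layer_sizes)

# Three sizes are listed, but only the hidden and output layers have parameters.
print("sizes listed:", len(layer_sizes))
print("layers with weights and bias:", len(params))
```

Under this convention, the network above would be described as a 2-layer network: one hidden layer and one output layer.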