Attention matrix (QK^T)

Scaled dot-product attention, introduced in the Transformer architecture paper, is the most common form. The attention score measures the similarity between a query vector and a key vector: each query attends to the keys, and the resulting scores weight the values.

Attention score functions:
- Dot-product attention
- Multiplicative attention
- Additive attention (Bahdanau attention)
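
The scaled dot-product form above can be sketched in a few lines of NumPy. This is a minimal illustrative implementation (function names, shapes, and the random inputs are assumptions, not from the source): the attention matrix is QK^T scaled by sqrt(d_k), passed through a softmax, and used to weight V.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def scaled_dot_product_attention(Q, K, V):
    # Q: (n_queries, d_k), K: (n_keys, d_k), V: (n_keys, d_v)
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)      # attention matrix QK^T / sqrt(d_k)
    weights = softmax(scores, axis=-1)   # each query's weights sum to 1
    return weights @ V, weights

# Toy example with random vectors (illustrative only).
rng = np.random.default_rng(0)
Q = rng.normal(size=(2, 4))
K = rng.normal(size=(3, 4))
V = rng.normal(size=(3, 4))
out, w = scaled_dot_product_attention(Q, K, V)
print(out.shape)  # (2, 4): one output vector per query
```

The sqrt(d_k) scaling keeps the dot products from growing with dimension, which would otherwise push the softmax into regions with vanishing gradients; additive (Bahdanau) attention instead scores query-key pairs with a small feed-forward network.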