Masked Self-Attention

Created: 2023 Mar 7 13:56

Edited: 2024 Feb 23 5:41

A special form of self-attention used in the decoder block.

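The decoder is autoregressive, so it has to predict each word without seeing the words that come after it.

To enforce this, the positions after the current one are masked so the decoder cannot attend to them.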
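Written out, the masking step looks roughly like the following. This is the usual scaled dot-product form with an additive mask; the symbols $Q$, $K$, $V$, $d_k$ and $M$ do not appear in this note and are assumed here, with $i$ the position being predicted from and $j$ the position being attended to.

```latex
\mathrm{MaskedAttention}(Q, K, V)
  = \mathrm{softmax}\!\left(\frac{Q K^{\top}}{\sqrt{d_k}} + M\right) V,
\qquad
M_{ij} =
\begin{cases}
0       & \text{if } j \le i \\
-\infty & \text{if } j > i
\end{cases}
```

Since $e^{-\infty} = 0$, every position with $j > i$ gets an attention weight of exactly zero after the softmax.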

Self-attention enables the decoder to focus on different parts of the output generated so far.
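A minimal NumPy sketch of this idea, assuming a single head, no batching, and random projection matrices. The names `causal_mask`, `masked_self_attention`, `w_q`, `w_k`, and `w_v` are made up for illustration and are not from this note or any particular library.

```python
import numpy as np

def causal_mask(seq_len):
    # True where attention is allowed: position i may see positions j <= i.
    return np.tril(np.ones((seq_len, seq_len), dtype=bool))

def masked_self_attention(x, w_q, w_k, w_v):
    # x: (seq_len, d_model), the tokens generated so far.
    q, k, v = x @ w_q, x @ w_k, x @ w_v
    d_k = q.shape[-1]
    scores = q @ k.T / np.sqrt(d_k)
    # Future positions get -inf so they receive zero weight after softmax.
    scores = np.where(causal_mask(x.shape[0]), scores, -np.inf)
    # Numerically stable row-wise softmax (the diagonal is always finite).
    scores = scores - scores.max(axis=-1, keepdims=True)
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=-1, keepdims=True)
    return weights @ v, weights

# Hypothetical toy input: 4 tokens with 8-dimensional embeddings.
rng = np.random.default_rng(0)
seq_len, d_model = 4, 8
x = rng.normal(size=(seq_len, d_model))
w_q, w_k, w_v = (rng.normal(size=(d_model, d_model)) for _ in range(3))

out, weights = masked_self_attention(x, w_q, w_k, w_v)

# Changing only the last token should leave earlier outputs untouched.
x_changed = x.copy()
x_changed[-1] += 1.0
out_changed, _ = masked_self_attention(x_changed, w_q, w_k, w_v)
```

In `weights`, everything above the diagonal is zero, so row `i` only mixes information from positions `0..i`. Comparing `out[:-1]` with `out_changed[:-1]` shows that earlier positions do not depend on a later token.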

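At each step where the decoder predicts an output word, it focuses more on the parts of the input words (here, the words generated so far) that are relevant to that word.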