Hidden vector, Thought vector
latent state, internal state
It is called "hidden" because it is an internal representation that people cannot read directly
The hidden state is computed based on the current input and the previous hidden state
Because each step depends on the previous step's result, this process is sequential and hinders parallelization
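A minimal sketch of the recurrence described above, assuming a plain (Elman-style) RNN with a tanh activation, i.e. h_t = tanh(W_xh x_t + W_hh h_{t-1} + b). The function names `rnn_step` and `run_rnn`, the weight names, and the toy values are hypothetical, chosen only for illustration. The loop in `run_rnn` is the sequential bottleneck: step t cannot start until step t-1 has produced its hidden state.

```python
import math


def rnn_step(x, h_prev, W_xh, W_hh, b):
    """One recurrent step: h_t = tanh(W_xh @ x_t + W_hh @ h_{t-1} + b)."""
    h = []
    for i in range(len(b)):
        s = b[i]
        s += sum(W_xh[i][j] * x[j] for j in range(len(x)))            # current input
        s += sum(W_hh[i][k] * h_prev[k] for k in range(len(h_prev)))  # previous hidden state
        h.append(math.tanh(s))
    return h


def run_rnn(xs, h0, W_xh, W_hh, b):
    """Run the recurrence over a sequence and return every hidden state."""
    h = h0
    states = []
    for x in xs:  # strictly in order: each step needs the previous h
        h = rnn_step(x, h, W_xh, W_hh, b)
        states.append(h)
    return states


# Toy usage with assumed 2-dimensional inputs and hidden states.
W_xh = [[1.0, 0.0], [0.0, 1.0]]
W_hh = [[0.5, 0.0], [0.0, 0.5]]
b = [0.0, 0.0]
states = run_rnn([[1.0, 0.0], [0.0, 1.0]], [0.0, 0.0], W_xh, W_hh, b)
```

The final entry of `states` depends on every earlier input through the chain of hidden states, which is why the time steps cannot be computed independently of one another.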
Some Intuition on Attention and the Transformer
What's the big deal, intuition on query-key-value vectors, multiple heads, multiple layers, and more.
https://eugeneyan.com/writing/attention/


Seonglae Cho