torch.nn.Embedding

Creator
Creator
Seonglae ChoSeonglae Cho
Created
Created
2024 Feb 24 7:59
Editor
Edited
Edited
2024 Mar 1 6:15
Refs
Refs
create token embedding table
  • number of embeddings
  • embedding dimension
  • padding_idx - index of padding token in the input indices
    • The main purpose of this parameter is to provide a way to ignore certain tokens during the embedding lookup, which is particularly useful for batch processing of sequences of varying lengths
    • the embedding vector for the padding index will not be updated during training.
    • This is the token id value not vector index
 
Increasing the vocabulary size to a multiple of 64 means that the data can be more easily divided into equally sized batches that align with the way memory is managed and computations are performed on GPUs.
# GPT-2 vocab_size of 50257, padded up to nearest multiple of 64 for efficiency vocab_size: int = 50304
return embedding tensor for input index tensor
 
 
 
 
 
 

Recommendations