torch.nn.Embedding

Creates a lookup table that maps each token id to a learnable embedding vector.

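num_embeddings - number of embeddings, i.e. the vocabulary size (rows in the table)

embedding_dim - size of each embedding vector (columns in the table)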

padding_idx - index of padding token in the input indices

The main purpose of this parameter is to let the model ignore padding tokens during the embedding lookup, which is particularly useful when batching sequences of varying lengths.
The embedding vector at padding_idx is initialized to zeros by default and receives no gradient, so it is not updated during training.
padding_idx is a token id (a value that appears in the input), not a position within the input sequence.
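
A minimal sketch of the padding behaviour, assuming a hypothetical vocabulary of 10 tokens in which id 0 is the padding token (the sizes and ids are illustrative, not taken from these notes):

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical sizes for illustration: 10 tokens, 4-dim vectors, id 0 = padding.
emb = nn.Embedding(num_embeddings=10, embedding_dim=4, padding_idx=0)

# Two sequences of different lengths, padded to length 4 with token id 0.
batch = torch.tensor([[5, 2, 7, 0],
                      [3, 0, 0, 0]])

out = emb(batch)        # shape: (2, 4, 4)
out.sum().backward()

# The padding row starts as zeros and receives a zero gradient,
# so a gradient step leaves it unchanged.
pad_row_is_zero = bool((emb.weight[0] == 0).all())
pad_grad_is_zero = bool((emb.weight.grad[0] == 0).all())
print(pad_row_is_zero, pad_grad_is_zero)
```

Rows for real tokens (for example id 5) still receive gradients as usual; only the row at padding_idx is frozen.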

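Increasing the vocabulary size to a multiple of 64 means the embedding matrix (and any output projection that shares its vocabulary dimension) has a dimension that divides evenly into the blocks GPU kernels work on, which aligns with the way memory is managed and computations are performed on GPUs and can improve throughput. The extra rows are unused, since no real token id maps to them.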
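
One way to round a vocabulary size up to a multiple of 64 is sketched below. The helper name pad_vocab_size and the example size 50257 are assumptions made for illustration; they do not come from these notes or from PyTorch.

```python
import torch.nn as nn


def pad_vocab_size(vocab_size: int, multiple: int = 64) -> int:
    """Round vocab_size up to the nearest multiple of `multiple` (hypothetical helper)."""
    return ((vocab_size + multiple - 1) // multiple) * multiple


# 50257 is used here only as an example vocabulary size.
padded = pad_vocab_size(50257)

# The table gets `padded` rows; ids at or above the real vocabulary size are never looked up.
emb = nn.Embedding(num_embeddings=padded, embedding_dim=8)
print(padded, tuple(emb.weight.shape))
```

Whether the padding actually speeds anything up depends on the hardware and kernels in use, so it is worth measuring rather than assuming.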

Returns the embedding tensor for an input tensor of integer indices: an input of shape (*) yields an output of shape (*, embedding_dim).
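
A short sketch of the lookup and its shapes, assuming a hypothetical table of 100 tokens with 16-dimensional vectors:

```python
import torch
import torch.nn as nn

# Hypothetical sizes chosen for illustration.
vocab_size, embedding_dim = 100, 16
emb = nn.Embedding(vocab_size, embedding_dim)

# Integer token ids, shape (batch=2, seq_len=3).
token_ids = torch.tensor([[1, 5, 9],
                          [4, 4, 0]])

vectors = emb(token_ids)   # shape: (2, 3, 16)

# The lookup is equivalent to indexing rows of the weight matrix.
same_as_indexing = torch.equal(vectors, emb.weight[token_ids])
print(tuple(vectors.shape), same_as_indexing)
```

Repeated ids (the two 4s in the second row) return the same vector, because each id selects one row of the table.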


Recommendations