InfoNCE

Core Principle: Pull Positive Pairs Together, Push Others Apart
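
For an anchor–positive pair $(x_i, y_i)$ in a batch of $N$ pairs, with similarity $s(\cdot,\cdot)$ (typically cosine) and temperature $\tau$, the standard InfoNCE loss for row $i$ is:

$$
\mathcal{L}_i = -\log \frac{\exp\big(s(x_i, y_i)/\tau\big)}{\sum_{j=1}^{N} \exp\big(s(x_i, y_j)/\tau\big)}
$$

Every other positive $y_{j \neq i}$ in the batch acts as a negative for $x_i$.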

Step 1: Embed Both Text Batches
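
A minimal sketch of this step. The random tensors below are stand-ins for real encoder outputs; in practice `emb1` and `emb2` come from a text encoder (e.g., a Transformer with mean pooling) applied to the two text batches.

```python
import torch
import torch.nn.functional as F

# Stand-in embeddings: replace with encoder(batch_of_texts) in practice
batch_size, dim = 3, 128
emb1 = torch.randn(batch_size, dim)  # anchors, e.g. queries
emb2 = torch.randn(batch_size, dim)  # positives, e.g. matching documents

# L2-normalize so dot products below become cosine similarities
emb1 = F.normalize(emb1, dim=-1)
emb2 = F.normalize(emb2, dim=-1)
```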

Step 2: Compute All Pairwise Similarities
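
Continuing the sketch above, a single matrix multiply yields every anchor–positive similarity at once:

```python
# (batch, batch) matrix: sim[i, j] = cosine similarity of emb1[i] and emb2[j]
sim = emb1 @ emb2.T
```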

Step 3: Define Positive and Negative Pairs

Diagonal entries represent positive pairs (emb1[i] matches emb2[i]), while off-diagonal entries serve as in-batch negatives.
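
In code, the targets for the cross entropy in Step 4 are simply the diagonal indices:

```python
# Row i's correct "class" is column i, so the targets are just 0..batch-1
labels = torch.arange(sim.size(0))
```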

Step 4: Optimize with Cross Entropy Loss

The loss maximizes diagonal similarities (positive pairs) while minimizing off-diagonal similarities (negative pairs).
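
Treating each row of the scaled similarity matrix as the logits of an N-way classification problem gives the loss directly (the temperature value below is illustrative):

```python
temperature = 0.05  # illustrative value; commonly tuned or learned
logits = sim / temperature

loss_a2p = F.cross_entropy(logits, labels)    # anchor -> positive
loss_p2a = F.cross_entropy(logits.T, labels)  # positive -> anchor
loss = (loss_a2p + loss_p2a) / 2              # symmetric InfoNCE
```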

Key Characteristics

  • In-batch negatives: Other samples in the batch automatically serve as negatives
  • Symmetric loss: Computed bidirectionally (anchor→positive and positive→anchor)
  • Learnable temperature: a temperature (scale) parameter on the logits is typically learned, controlling how sharply the softmax penalizes near-miss negatives (see the sketch below)
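
Putting these characteristics together: a self-contained sketch (class and parameter names are illustrative) that folds the symmetric loss and a learnable temperature into one module, using the CLIP-style log-scale parameterization:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class InfoNCELoss(nn.Module):
    """Symmetric InfoNCE with a learnable temperature (CLIP-style)."""

    def __init__(self, init_temperature: float = 0.07):
        super().__init__()
        # Parameterize log(1/T) for numerical stability, as CLIP does
        self.logit_scale = nn.Parameter(torch.log(torch.tensor(1.0 / init_temperature)))

    def forward(self, emb1: torch.Tensor, emb2: torch.Tensor) -> torch.Tensor:
        emb1 = F.normalize(emb1, dim=-1)
        emb2 = F.normalize(emb2, dim=-1)
        logits = self.logit_scale.exp() * (emb1 @ emb2.T)
        labels = torch.arange(logits.size(0), device=logits.device)
        # Bidirectional: average anchor->positive and positive->anchor
        return (F.cross_entropy(logits, labels) + F.cross_entropy(logits.T, labels)) / 2
```

Usage is simply `loss = InfoNCELoss()(emb1, emb2)`; the temperature is updated by the optimizer along with the encoder weights.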

Example Similarity Matrix

For batch [(a1,p1), (a2,p2), (a3,p3)], the 3×3 similarity matrix looks like:

|        | p1       | p2       | p3       |
|--------|----------|----------|----------|
| **a1** | positive | negative | negative |
| **a2** | negative | positive | negative |
| **a3** | negative | negative | positive |

Cross entropy with labels [0, 1, 2] pushes each row's probability mass onto its diagonal entry.

CachedInfoNCE
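
CachedInfoNCE (the GradCache technique) decouples the number of in-batch negatives from GPU memory: embeddings are first computed in small chunks without gradients, the InfoNCE loss and its gradients with respect to the embeddings are computed over the full batch, and the chunks are then re-encoded with gradients to backpropagate. A usage sketch with sentence-transformers, which exposes this as `CachedMultipleNegativesRankingLoss` (the model name and mini-batch size are illustrative; check the library docs for exact arguments):

```python
from sentence_transformers import SentenceTransformer, losses

model = SentenceTransformer("all-MiniLM-L6-v2")  # any encoder checkpoint
# Many in-batch negatives (large effective batch) at a small memory cost
loss = losses.CachedMultipleNegativesRankingLoss(model, mini_batch_size=32)
```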


Recommendations