InfoNCE

Creator

Seonglae Cho

Created

2026 Jan 16 0:51

Editor

Seonglae Cho

Edited

2026 Jan 26 15:16

Refs

Core Principle: Pull Positive Pairs Together, Push Others Apart

Step 1: Embed Both Text Batches
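The page does not say which encoder produces the embeddings, so the sketch below starts from hypothetical raw vectors and only shows a step that commonly follows encoding: L2 normalization, which makes later dot products equal to cosine similarity. The names `raw1`, `raw2`, and `l2_normalize` are illustrative assumptions; `emb1` and `emb2` follow the naming used in Step 3.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length so dot products equal cosine similarity."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vec]


# Hypothetical raw encoder outputs for two aligned batches of three texts each.
# A real pipeline would obtain these from a text encoder (not specified here).
raw1 = [[3.0, 4.0], [1.0, 0.0], [0.0, 2.0]]
raw2 = [[6.0, 8.0], [2.0, 0.1], [0.1, 5.0]]

emb1 = [l2_normalize(v) for v in raw1]  # anchors
emb2 = [l2_normalize(v) for v in raw2]  # positives: emb2[i] pairs with emb1[i]
```

Both batches must have the same length and the same ordering, since the pairing is carried only by the index.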

Step 2: Compute All Pairwise Similarities
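A minimal sketch of this step, assuming the embeddings are already unit-length lists of floats. The function name `similarity_matrix` and the `temperature` parameter are assumptions for illustration; the page itself does not mention a temperature, so the example call uses 1.0 (no scaling).

```python
def similarity_matrix(emb1, emb2, temperature=1.0):
    """Return sim[i][j] = dot(emb1[i], emb2[j]) / temperature.

    With unit-length inputs the dot product is the cosine similarity.
    `temperature` is an assumed knob; 1.0 leaves the similarities unscaled.
    """
    return [
        [sum(a * b for a, b in zip(u, v)) / temperature for v in emb2]
        for u in emb1
    ]


# Hypothetical unit-length embeddings for a batch of three pairs.
emb1 = [[1.0, 0.0], [0.0, 1.0], [0.6, 0.8]]
emb2 = [[0.8, 0.6], [0.0, 1.0], [0.6, 0.8]]

sim = similarity_matrix(emb1, emb2)  # 3 x 3: rows are anchors, columns are candidates
```

Row `i` holds the similarity of anchor `i` to every candidate in the batch, so a batch of N pairs yields an N x N matrix.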

Step 3: Define Positive and Negative Pairs

Diagonal entries represent positive pairs (emb1[i] matches emb2[i]), while off-diagonal entries serve as in-batch negatives.
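The diagonal convention can be written down as a label vector: the correct column for row `i` is simply `i`. The sketch below assumes a batch of three pairs; `labels` and `positive_mask` are illustrative names.

```python
batch_size = 3  # assumed batch of three (anchor, positive) pairs

# Target column for each row: row i should pick column i (the diagonal).
labels = list(range(batch_size))

# Boolean view of the same information: True marks a positive pair,
# False marks an in-batch negative.
positive_mask = [[i == j for j in range(batch_size)] for i in range(batch_size)]

# Every anchor gets one positive and (batch_size - 1) in-batch negatives.
negatives_per_anchor = batch_size - 1
```

Because negatives come from the rest of the batch, a larger batch gives each anchor more negatives without any extra mining step.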

Step 4: Optimize with Cross Entropy Loss

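Each row of the similarity matrix is treated as the logits of a classification problem whose correct class is the diagonal index. Minimizing cross entropy therefore raises diagonal similarities (positive pairs) relative to off-diagonal similarities (negative pairs) in the same row.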
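As a rough illustration, the loss in one direction can be sketched in plain Python as a mean of per-row cross entropies with the diagonal as target. The function names and the two toy matrices are assumptions made for this example, not values from the page.

```python
import math


def cross_entropy_row(logits, target):
    """Cross entropy of softmax(logits) against a single target index."""
    m = max(logits)  # subtract the max for numerical stability
    log_sum_exp = m + math.log(sum(math.exp(x - m) for x in logits))
    return log_sum_exp - logits[target]


def info_nce(sim):
    """Mean cross entropy over rows, with row i's target set to column i."""
    n = len(sim)
    return sum(cross_entropy_row(sim[i], i) for i in range(n)) / n


# Toy similarity matrices: one with a strong diagonal, one without.
aligned = [[5.0, 0.0, 0.0], [0.0, 5.0, 0.0], [0.0, 0.0, 5.0]]
misaligned = [[0.0, 5.0, 0.0], [5.0, 0.0, 0.0], [0.0, 0.0, 0.0]]

loss_aligned = info_nce(aligned)
loss_misaligned = info_nce(misaligned)
```

The aligned matrix gives the lower loss, since the softmax in each row already concentrates on the diagonal entry.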

Key Characteristics

In-batch negatives: Other samples in the batch automatically serve as negatives

Symmetric loss: Computed bidirectionally (anchor→positive and positive→anchor)

Learnable threshold: No similarity cutoff is fixed in advance; the model learns to separate positive from negative pairs by ranking each positive above the negatives in its row
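The symmetric variant listed above can be sketched by computing the loss over the rows (anchor→positive), computing it again over the columns (positive→anchor) via a transpose, and averaging the two. Averaging with equal weights is an assumption of this sketch, as are the function names and the toy matrix.

```python
import math


def mean_row_cross_entropy(sim):
    """Mean cross entropy over rows, with row i's target set to column i."""
    total = 0.0
    for i, row in enumerate(sim):
        m = max(row)  # subtract the max for numerical stability
        total += m + math.log(sum(math.exp(x - m) for x in row)) - row[i]
    return total / len(sim)


def transpose(matrix):
    return [list(col) for col in zip(*matrix)]


def symmetric_info_nce(sim):
    """Average of the anchor->positive and positive->anchor losses."""
    forward = mean_row_cross_entropy(sim)              # rows: anchors pick positives
    backward = mean_row_cross_entropy(transpose(sim))  # columns: positives pick anchors
    return 0.5 * (forward + backward)


# Toy asymmetric similarity matrix, so the two directions differ.
sim = [[2.0, 1.0, 1.0], [0.0, 2.0, 0.0], [0.0, 0.0, 2.0]]
loss = symmetric_info_nce(sim)
```

The two directions share the same diagonal but normalize over different sets of negatives, which is why they can disagree on a non-symmetric matrix.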

Example Similarity Matrix

For batch [(a1,p1), (a2,p2), (a3,p3)]:
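The layout of that matrix can be generated with a small sketch: rows are anchors, columns are positives, and a trailing `*` marks the positive pair on the diagonal. The `*` marker and the variable names are conventions of this example only.

```python
anchors = ["a1", "a2", "a3"]
positives = ["p1", "p2", "p3"]

# grid[i][j] names the similarity between anchor i and candidate j.
# Diagonal cells (i == j) are positive pairs; all other cells are in-batch negatives.
grid = [
    [f"sim({a},{p})" + ("*" if i == j else " ") for j, p in enumerate(positives)]
    for i, a in enumerate(anchors)
]

for row in grid:
    print("  ".join(row))
```

Each row contains exactly one starred cell, so every anchor sees one positive and two in-batch negatives in this batch of three.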

CachedInfoNCE
