Recommend Embedding

Creator: Seonglae Cho
Created: 2024 Nov 22 23:08
Edited: 2025 Sep 1 22:54
Refs: Recommend Embedding Models
limit (google-deepmind), updated 2025 Sep 1 22:25

Single-vector embeddings have structural constraints in handling all top-k combinations for arbitrary instructional and inferential queries; a hybrid approach combining multi-vector, sparse, and re-ranker models is necessary. Intuitively, this is because compressing "all relationships" into a single point (vector) geometrically requires dividing the vector space into numerous regions. However, the number of regions a d-dimensional embedding can realize is bounded by the sign-rank, while the number of regions needed to represent all top-k relevance combinations grows combinatorially, on the order of C(n, k) for n documents. Therefore, when the number of top-k combinations exceeds the "space partitioning capacity" allowed by the dimension, some combinations cannot be correctly represented.
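A toy illustration of the region-counting intuition (a sketch of the geometric argument, not the paper's construction; all sizes are illustrative): with 10 documents embedded in only 2 dimensions, sweeping many random query vectors realizes only a fraction of the C(10, 2) = 45 possible top-2 result sets, because a low-dimensional dot-product scorer cannot carve out a region for every combination.

```python
from math import comb
import random

random.seed(0)
n, d, k = 10, 2, 2  # 10 documents, 2-dim embeddings, top-2 retrieval
docs = [[random.gauss(0, 1) for _ in range(d)] for _ in range(n)]

def top_k(query):
    # rank documents by dot-product score, highest first
    order = sorted(range(n),
                   key=lambda i: -sum(q * x for q, x in zip(query, docs[i])))
    return frozenset(order[:k])

# sweep many random queries and record every distinct top-2 set that appears
seen = {top_k([random.gauss(0, 1) for _ in range(d)]) for _ in range(20000)}
print(len(seen), "of", comb(n, k), "possible top-2 sets are realizable")
```

No matter how many queries are sampled, the count of realizable top-2 sets stays strictly below 45; raising d closes the gap, which is the dimension dependence the argument above describes.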
This limitation is theoretically connected to the sign-rank of the qrel matrix A: for a binary query-document relevance matrix A, exactly representing its top-k sets with d-dimensional embeddings requires d ≥ rank±(2A − 1) − 1, where rank± denotes the sign-rank.
This means that for any given dimension d, some qrel combinations cannot be represented. This was empirically confirmed through "free embedding" experiments, which directly optimize the query and document vectors themselves on the test set (no encoder involved), so any remaining failure is attributable to the dimension rather than the model.
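The free-embedding setup can be sketched as follows (the hinge loss, learning rate, and problem sizes here are illustrative assumptions, not the paper's exact recipe): the query and document vectors are free parameters optimized by gradient descent against a binary qrel matrix, and the final check is whether each query's top-k exactly recovers its relevant set.

```python
import numpy as np

rng = np.random.default_rng(0)
m, n, k, d = 20, 8, 2, 3  # queries, documents, top-k, embedding dim

# dense qrels: each query's relevant set is a distinct pair of documents
A = np.zeros((m, n))
pairs = [(i, j) for i in range(n) for j in range(i + 1, n)]
for q, (i, j) in enumerate(pairs[:m]):
    A[q, i] = A[q, j] = 1.0
rel = A.astype(bool)

U = 0.1 * rng.normal(size=(m, d))  # free query vectors (no encoder)
V = 0.1 * rng.normal(size=(n, d))  # free document vectors

for _ in range(3000):  # pairwise hinge loss, plain gradient descent
    S = U @ V.T                                   # (m, n) similarity scores
    diff = S[:, :, None] - S[:, None, :]          # S[i, r] - S[i, s]
    # a (relevant r, irrelevant s) pair is active while its margin is violated
    active = (diff < 1.0) & rel[:, :, None] & ~rel[:, None, :]
    gS = (active.sum(axis=1) - active.sum(axis=2)).astype(float)
    gS /= max(active.sum(), 1)                    # average over active pairs
    U, V = U - 0.1 * (gS @ V), V - 0.1 * (gS.T @ U)

topk = np.argsort(-(U @ V.T), axis=1)[:, :k]
acc = float(np.mean([set(topk[i]) == set(np.where(rel[i])[0])
                     for i in range(m)]))
print(f"fraction of queries whose top-{k} matches its qrels at d={d}: {acc:.2f}")
```

Because even the vectors are unconstrained, any accuracy below 1.0 in such a run can only come from d being too small for the qrel pattern, which is the point of the free-embedding test.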
The proposed LIMIT dataset instantiates this theory in natural language and is designed to create maximally dense qrel combinations (high graph density). State-of-the-art single-vector embedding models (GritLM, Qwen3, Gemini Embedding, etc.) fail dramatically, with Recall@100 below 20%.
Hence sparse models: with very high dimensionality (effectively near-infinite, vocabulary-sized vectors), they perform almost perfectly on LIMIT. Multi-vector approaches outperform single-vector ones but are not a complete solution either.
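A minimal sketch of why (near-)infinite-dimensional sparse representations sidestep the bound: give each document its own dimension, as an exact-match lexical model effectively can, and every top-k combination becomes trivially representable by an indicator query.

```python
from itertools import combinations
import numpy as np

n, k = 8, 2
V = np.eye(n)                        # one (sparse) dimension per document

ok = True
for rel in combinations(range(n), k):
    q = np.zeros(n)
    q[list(rel)] = 1.0               # query = indicator of its relevant docs
    top = set(np.argsort(-(V @ q))[:k])
    ok &= (top == set(rel))
print(f"all {len(list(combinations(range(n), k)))} top-{k} sets representable: {ok}")
# prints: all 28 top-2 sets representable: True
```

With n dimensions there is no region-counting constraint left, which is why sparse scorers saturate LIMIT while dense single-vector models cannot.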

Recommendations