REVELA

Creator
Seonglae Cho
Created
2026 May 21 10:57
Edited
2026 May 21 11:13
Revela’s key idea is to introduce an in-batch attention mechanism that models inter-document dependencies within a batch. Similarity scores computed by the retriever serve as attention weights, so when predicting the current sequence the model can draw on context from other relevant documents in the same batch. During training, the retriever learns a probability distribution over in-batch similarities and is optimized jointly with the language model under the next-token prediction (NTP) objective.
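The weighting step described above can be sketched in a few lines. This is a minimal illustration, not Revela's actual implementation: the cosine similarity, the `temperature` parameter, the exclusion of a document's own entry, and the names `in_batch_weights` and `mix_context` are all assumptions made for the example.

```python
import math

def in_batch_weights(embeddings, temperature=1.0):
    """Map retriever embeddings to attention weights over the OTHER documents in the batch.

    Expects at least two documents. Cosine similarity and the temperature are
    assumptions of this sketch, not details taken from the source.
    """
    # L2-normalize so that a dot product equals cosine similarity.
    normed = []
    for vec in embeddings:
        norm = math.sqrt(sum(x * x for x in vec)) or 1.0
        normed.append([x / norm for x in vec])

    weights = []
    for i, query in enumerate(normed):
        # Score document i against every other document j (self is excluded).
        scores = {j: sum(a * b for a, b in zip(query, key)) / temperature
                  for j, key in enumerate(normed) if j != i}
        top = max(scores.values())  # subtract the max for numerical stability
        exps = {j: math.exp(s - top) for j, s in scores.items()}
        total = sum(exps.values())
        weights.append([exps[j] / total if j != i else 0.0
                        for j in range(len(normed))])
    return weights

def mix_context(weights, doc_states):
    """Weighted sum of the other documents' states: one mixed vector per document."""
    dim = len(doc_states[0])
    return [[sum(w * state[d] for w, state in zip(row, doc_states))
             for d in range(dim)]
            for row in weights]

# Toy batch: documents 0 and 1 are near-duplicates, document 2 is unrelated.
embeddings = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
weights = in_batch_weights(embeddings, temperature=0.1)
context = mix_context(weights, doc_states=[[1.0, 0.0], [0.0, 1.0], [5.0, 5.0]])
```

In this toy batch, document 0 puts almost all of its weight on document 1, so its mixed context is close to document 1's state. Because the weights are a differentiable function of the embeddings, a language-modeling loss computed on top of the mixed context can send gradients back into the retriever.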
Architecturally, Revela applies V-normalization in cross-document attention to prevent any single token from dominating, encouraging the model to focus on sequence-level semantic information. Unlike prior methods such as REPLUG, which compute LM perplexity for each document pair and thus incur a cost that grows with the number of pairs, Revela jointly processes all documents in a single forward pass, reducing training complexity to linear. This design scales well even with large batch sizes and large models.
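To see why normalizing values limits the influence of any single token, consider the sketch below. It assumes that V-normalization means rescaling each value vector to unit L2 norm before the attention-weighted sum. That reading is an assumption of this example and the exact formulation in Revela may differ. The helper names are hypothetical.

```python
import math

def l2_normalize(vec, eps=1e-12):
    """Rescale a vector to (approximately) unit L2 norm."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / (norm + eps) for x in vec]

def attend(weights, values, normalize_values):
    """Attention-weighted sum of value vectors, optionally unit-normalized first."""
    if normalize_values:
        values = [l2_normalize(v) for v in values]
    dim = len(values[0])
    return [sum(w * v[d] for w, v in zip(weights, values)) for d in range(dim)]

# Three tokens from another document; the last has an outsized value vector.
values = [[1.0, 0.0], [0.0, 1.0], [50.0, 50.0]]
weights = [0.45, 0.45, 0.10]

raw = attend(weights, values, normalize_values=False)
normed = attend(weights, values, normalize_values=True)

# Share of the first output coordinate contributed by the outlier token.
outlier_share_raw = (0.10 * 50.0) / raw[0]
outlier_share_normed = (0.10 / math.sqrt(2)) / normed[0]
```

Without normalization, the outlier token supplies most of the output despite receiving only 10% of the attention weight. With unit-norm values, each token's contribution is bounded by its attention weight.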
Empirically, Revela achieves an nDCG@10 score 2.8 points higher than the 7B-parameter supervised model E5-Mistral-7B-Instruct on the CoIR code-retrieval benchmark. It also surpasses previous state-of-the-art unsupervised retrievers such as Contriever and LaPraDoR on BEIR, establishing a new SoTA while using roughly 1000× less training data and 10× less compute than prior approaches.
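For readers unfamiliar with the metric, here is a small sketch of nDCG@10. It assumes the linear-gain form of DCG (relevance divided by a log2 rank discount). Some evaluations use an exponential gain instead, and the source does not say which variant these benchmarks apply. The function names are illustrative.

```python
import math

def dcg_at_k(relevances, k=10):
    """Discounted cumulative gain with linear gain: rel / log2(position + 1)."""
    return sum(rel / math.log2(rank + 2)
               for rank, rel in enumerate(relevances[:k]))

def ndcg_at_k(ranked_relevances, k=10):
    """DCG of the ranking divided by the DCG of the ideal reordering."""
    ideal_dcg = dcg_at_k(sorted(ranked_relevances, reverse=True), k)
    return dcg_at_k(ranked_relevances, k) / ideal_dcg if ideal_dcg > 0 else 0.0

# Relevance labels of the retrieved documents, in ranked order.
perfect = ndcg_at_k([1, 1, 0, 0])   # both relevant documents ranked first
swapped = ndcg_at_k([0, 1, 1, 0])   # an irrelevant document ranked first
```

Scores fall between 0 and 1 and are commonly reported as percentages, in which case a gap of 2.8 points corresponds to 0.028 on the 0 to 1 scale.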
 
 
 
Revela: Dense Retriever Learning via Language Modeling
Dense retrievers play a vital role in accessing external and specialized knowledge to augment language models (LMs). Training dense retrievers typically requires annotated query-document pairs,...
trumancai/Revela-3b · Hugging Face