Large Concept Model

Creator: Seonglae Cho
Created: 2025 Jan 3 22:39
Edited: 2025 Jan 8 22:9
Refs: SONAR

Language Modeling in a Sentence Representation Space

Because it operates in the SONAR embedding space (with a frozen encoder), the model is nominally independent of language and modality, since SONAR is a multimodal encoder. The backbone is still a Transformer; in the diffusion-based LCM variant, a diffusion model learns the conditional probability distribution of the next sentence embedding given the preceding sentence embeddings.
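To make the "diffusion over the next sentence embedding" idea concrete, here is a minimal toy sketch in plain Python. It is not the LCM implementation. The function names (`make_schedule`, `toy_noise_predictor`, `sample_next_embedding`), the linear noise schedule, and the tiny embedding size are all assumptions made for illustration. In particular, `toy_noise_predictor` is a hypothetical stand-in for the trained Transformer denoiser: it pretends the "correct" next embedding is simply the mean of the context embeddings, so the sampling loop can be checked by hand. In a real system the context vectors would come from the frozen SONAR encoder and the sampled vector would be passed to the SONAR decoder.

```python
import math
import random


def make_schedule(num_steps=50, beta_start=1e-4, beta_end=0.02):
    """Linear noise schedule (an assumption; other schedules are possible)."""
    betas = [beta_start + (beta_end - beta_start) * i / (num_steps - 1)
             for i in range(num_steps)]
    alpha_bars, running = [], 1.0
    for beta in betas:
        running *= 1.0 - beta          # cumulative product of (1 - beta)
        alpha_bars.append(running)
    return betas, alpha_bars


def toy_noise_predictor(x_t, t, context, alpha_bars):
    """Hypothetical stand-in for the trained denoiser.

    Pretends the target next embedding is the mean of the context
    embeddings and returns the noise that would explain x_t under that target.
    """
    dim = len(x_t)
    target = [sum(vec[i] for vec in context) / len(context) for i in range(dim)]
    scale = math.sqrt(alpha_bars[t])
    spread = math.sqrt(1.0 - alpha_bars[t])
    return [(x_t[i] - scale * target[i]) / spread for i in range(dim)]


def sample_next_embedding(context, num_steps=50, seed=0):
    """DDPM-style ancestral sampling of one next-sentence embedding."""
    rng = random.Random(seed)
    betas, alpha_bars = make_schedule(num_steps)
    dim = len(context[0])
    x = [rng.gauss(0.0, 1.0) for _ in range(dim)]   # start from pure noise
    for t in range(num_steps - 1, -1, -1):
        eps_hat = toy_noise_predictor(x, t, context, alpha_bars)
        coef = betas[t] / math.sqrt(1.0 - alpha_bars[t])
        inv = 1.0 / math.sqrt(1.0 - betas[t])
        sigma = math.sqrt(betas[t]) if t > 0 else 0.0   # no noise on the last step
        x = [inv * (x[i] - coef * eps_hat[i]) + sigma * rng.gauss(0.0, 1.0)
             for i in range(dim)]
    return x
```

Generation is then autoregressive at the sentence level: append the sampled vector to `context` and call `sample_next_embedding` again. Because the toy predictor has a single fixed target, every sample collapses to that target; a trained denoiser would instead represent a distribution over many plausible next sentences.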

Limitation

Because the space of possible next sentences is vast, predicting a sentence embedding requires more training data and more sophisticated modeling to generate an appropriate continuation. This is a limitation of working only at the sentence level, and it calls for extending the approach to both smaller and larger units. The model also inherits SONAR's limitations.