Mixture of Multimodal Experts
MoME: Mixture of Multimodal Experts for Generalist Multimodal Large Language Models
Multimodal large language models (MLLMs) have demonstrated impressive capabilities across various vision-language tasks. However, a generalist MLLM typically underperforms compared with a specialist MLLM on most tasks, which the paper attributes to task interference. MoME mitigates this by specializing within a single model: a mixture of vision experts (MoVE) adaptively modulates features from multiple vision encoders, and a mixture of language experts (MoLE) adds sparsely gated experts to the LLM at roughly unchanged inference cost.
https://arxiv.org/abs/2407.12709
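The MoLE side follows the familiar sparsely gated mixture-of-experts pattern: a router scores the experts per token and only the top-k are run, so capacity grows while per-token compute stays nearly flat. Below is a minimal PyTorch sketch of that pattern, not the paper's implementation; the class name, dimensions, expert count, and top-1 routing are illustrative assumptions.

```python
# Minimal sparsely gated MoE feed-forward layer (illustrative sketch,
# not MoME's actual MoLE code; all hyperparameters are assumptions).
import torch
import torch.nn as nn
import torch.nn.functional as F

class SparseMoEFFN(nn.Module):
    def __init__(self, d_model=512, d_ff=2048, num_experts=4, top_k=1):
        super().__init__()
        self.top_k = top_k
        self.gate = nn.Linear(d_model, num_experts)  # router: token -> expert scores
        self.experts = nn.ModuleList([
            nn.Sequential(nn.Linear(d_model, d_ff), nn.GELU(),
                          nn.Linear(d_ff, d_model))
            for _ in range(num_experts)
        ])

    def forward(self, x):
        # x: (batch, seq, d_model); route each token to its top-k experts
        logits = self.gate(x)                            # (B, S, E)
        weights, idx = logits.topk(self.top_k, dim=-1)   # (B, S, k)
        weights = F.softmax(weights, dim=-1)             # normalize over chosen experts
        out = torch.zeros_like(x)
        for e, expert in enumerate(self.experts):
            mask = (idx == e)                            # tokens routed to expert e
            if mask.any():
                tok_w = (weights * mask).sum(-1, keepdim=True)  # (B, S, 1)
                sel = mask.any(-1)                       # (B, S) selection mask
                out[sel] += tok_w[sel] * expert(x[sel])  # weighted expert output
        return out

x = torch.randn(2, 16, 512)
moe = SparseMoEFFN()
print(moe(x).shape)  # torch.Size([2, 16, 512])
```

With top_k=1 each token activates a single expert, which is why adding experts barely changes inference cost even as total parameter count grows.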