Vision Language Model

Creator

Seonglae Cho

Created

2023 Nov 1 8:30

Editor

Seonglae Cho

Edited

2026 Feb 13 14:37

Refs

Vision Model

Vision Transformer

VLM

In VLMs, data is the bottleneck rather than model architecture; the multimodal field is now moving from a model-centric to a data-centric approach.

VLM Notion

Visual Grounding

Visual distractor

Vision Language Models

Analysis

A Dive into Vision-Language Models

https://huggingface.co/blog/vision_language_pretraining

Leaderboard

Open VLM Leaderboard - a Hugging Face Space by opencompass

https://huggingface.co/spaces/opencompass/open_vlm_leaderboard

https://arxiv.org/pdf/2411.04996
