
OmniParser

Creator: Seonglae Cho
Created: 2024 Dec 1 1:57
Editor: Seonglae Cho
Edited: 2025 Feb 19 21:0

Screen parsing (the simplest "eye" of an AI agent)
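To make "screen parsing" concrete, here is a minimal sketch of how parsed UI elements might be turned into text that an LLM can act on. The `ScreenElement` schema and the `describe_screen` helper are hypothetical illustrations, not OmniParser's actual output format or API.

```python
from dataclasses import dataclass


@dataclass
class ScreenElement:
    """One parsed UI element (hypothetical schema, not OmniParser's format)."""
    label: str                      # short description, e.g. "Search box"
    box: tuple[int, int, int, int]  # (left, top, right, bottom) in pixels
    interactable: bool              # whether an agent could click or type here


def center(el: ScreenElement) -> tuple[int, int]:
    """Return the pixel at the middle of the element's bounding box."""
    left, top, right, bottom = el.box
    return ((left + right) // 2, (top + bottom) // 2)


def describe_screen(elements: list[ScreenElement]) -> str:
    """Serialize interactable elements as numbered lines for an LLM prompt."""
    lines = []
    for i, el in enumerate(e for e in elements if e.interactable):
        x, y = center(el)
        lines.append(f"[{i}] {el.label} at ({x}, {y})")
    return "\n".join(lines)


# Made-up example data standing in for a parsed screenshot.
elements = [
    ScreenElement("Search box", (100, 20, 500, 60), True),
    ScreenElement("Logo image", (10, 10, 80, 70), False),
    ScreenElement("Submit button", (520, 20, 600, 60), True),
]
print(describe_screen(elements))
```

The non-interactable logo is filtered out, so the LLM sees only numbered candidates it can refer to by index when choosing an action.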

OmniParser
microsoft • Updated 2025 Feb 19 21:0

OmniParser V2

OmniParser V2: Turning Any LLM into a Computer Use Agent - Microsoft Research
Authors: Yadong Lu, Senior Researcher; Thomas Dhome-Casanova, Software Engineer; Jianwei Yang, Principal Researcher; Ahmed Awadallah, Partner Research Manager
Graphical user interface (GUI) automation requires agents that can understand and interact with user screens. However, using general-purpose LLMs as GUI agents faces several challenges: 1) reliably identifying interactable icons within the […]
https://www.microsoft.com/en-us/research/articles/omniparser-v2-turning-any-llm-into-a-computer-use-agent/
paper
https://microsoft.github.io/OmniParser/
model
microsoft/OmniParser · Hugging Face
https://huggingface.co/microsoft/OmniParser
web demo
https://scrimba.com/s08johf0et/head

Copyright Seonglae Cho