AI Coding Agent
The main problem for a coding agent is the language model's context size
The advent of large language models could drive the cost of software development toward zero, sparking rapid and diverse growth in software akin to the content boom or the Cambrian explosion.

LLMs should be used in conjunction with other tools to prevent the human review process from becoming a bottleneck.
One way to automate learning pairs a generative model with a discriminative model, as in a GAN. High-level AI development typically follows this generate-and-evaluate pattern, and the evaluation itself has to be automated. Images can be compared visually, but text, code, and audio are much harder to evaluate. A good AI coding assistant should therefore not just return a result; it should break the task into smaller steps that are each easy to verify. This emphasis on verifiability parallels verifiable rewards in reinforcement learning, and it suggests that larger units such as code blocks or video clips should be brought in gradually.
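The idea of small, easily verifiable steps can be sketched in Python. This is only an illustration under stated assumptions: `Step`, `make_check`, and `run_steps` are hypothetical names invented here, and the `produce` callables stand in for whatever an assistant would actually generate. Each step carries its own cheap automatic check, and the run stops at the first step that fails, so a reviewer only has to look at one small unit.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


@dataclass
class Step:
    """One small unit of work with its own automatic check (hypothetical type)."""
    name: str
    produce: Callable[[], str]     # stands in for the assistant's generated code
    verify: Callable[[str], bool]  # cheap automatic check for that output


def make_check(func_name: str, args: tuple, expected: object) -> Callable[[str], bool]:
    """Build a verifier that executes the source and calls one function from it."""
    def check(source: str) -> bool:
        namespace: dict = {}
        try:
            # Illustration only: never exec untrusted code outside a sandbox.
            exec(source, namespace)
            return namespace[func_name](*args) == expected
        except Exception:
            return False
    return check


def run_steps(steps: List[Step]) -> Tuple[List[str], Optional[str]]:
    """Run steps in order and stop at the first one that fails verification.

    Returns the names of the verified steps and the name of the failing
    step, or None if every step passed.
    """
    passed: List[str] = []
    for step in steps:
        output = step.produce()
        if not step.verify(output):
            return passed, step.name
        passed.append(step.name)
    return passed, None


steps = [
    Step("write add()",
         lambda: "def add(a, b):\n    return a + b\n",
         make_check("add", (2, 3), 5)),
    # Deliberately buggy output, so the second step fails its check.
    Step("write sub()",
         lambda: "def sub(a, b):\n    return a + b\n",
         make_check("sub", (5, 3), 2)),
]

passed, failed = run_steps(steps)
print("verified:", passed)
print("needs attention:", failed)
```

Because the failure is localized to a single named step, the human review effort is proportional to one step rather than to the whole task.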
AI Coder Services
Codex
Claude Code
Gemini CLI
Jules
GitHub Copilot
Devin
Gemini Code Assist
Phind
LLM LS
Tabby ML
Continue Dev
Tabnine
Cody
CodeWhisperer
Codeium
Codey
Q Developer
Aider-chat
Meticulous
OpenCode
TurinTech
AutoDev
Mistral Vibe
AI Coding Agents
AI Coder Models
Element-targeted annotation UX is important for alignment
AI Wireframe Tools
V0
Google Stitch
Bolt.new
GitHub Spark
tldraw
Design2Code
Screenshot to Code
Same dev
Reflex Build
Lovable
AI Wireframe Protocol - Framework agnostic
Code Quality leaderboard
LLM Leaderboard for Code Quality & Security | Sonar
Independent analysis of code generation quality, security, and maintainability for leading LLMs.
https://www.sonarsource.com/the-coding-personalities-of-leading-llms/leaderboard/

Design Arena
Design Arena on Twitter / X
3D Design Battle: DeepSeek-V3 vs Sonnet 4. @deepseek_ai ranks #1 on https://t.co/bqMbSX8tjH for 3D design. One-shot prompt: Build a model of a globe. Design Arena (@designarena_ai), June 28, 2025
https://x.com/designarena_ai/status/1938806353956610217?_bhlid=cc2688dd7982d3ce603a956c8db41b54023505fb
Current limitations
- Stop Digging; Know Your Limits
- Mise en Place
- Scientific Debugging
- The tail wagging the dog
- Consistent formatting
- Read the Docs
- Use Static Types
AI Blindspots
Blindspots in LLMs I’ve noticed while AI coding. Sonnet family emphasis. Maybe I will eventually suggest Cursor rules for these problems.
https://ezyang.github.io/ai-blindspots/
Leaderboard
Big Code Models Leaderboard - a Hugging Face Space by bigcode
https://huggingface.co/spaces/bigcode/bigcode-models-leaderboard
PR and Git workflow integration
Resolving code review comments with ML
https://blog.research.google/2023/05/resolving-code-review-comments-with-ml.html

Designing tools for developers means designing for LLMs too
Most large language models (LLMs) aren't great at using less popular frameworks.
Using LLMs to help LLMs build Encore apps – Encore Blog
How we used LLMs to produce instructions for LLMs to build Encore applications.
https://encore.dev/blog/llm-instructions

MLE-STAR: A state-of-the-art machine learning engineering agent
Jinsung Yoon, Research Scientist, and Jaehyun Nam, Student Researcher, Google Cloud
https://research.google/blog/mle-star-a-state-of-the-art-machine-learning-engineering-agents/


Seonglae Cho

