Not yet available over WebRTC; the Live API is currently exposed over WebSockets.
Multimodal Live API | Gemini API | Google AI for Developers
The Multimodal Live API enables low-latency, two-way interactions
that use text, audio, and video input, with audio and text output.
This facilitates natural, human-like voice conversations with the ability to
interrupt the model at any time. The model's video understanding capability
expands communication modalities, enabling you to share camera input or
screencasts and ask questions about them.
https://ai.google.dev/api/multimodal-live
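
A minimal sketch of what a text-in / text-out Live session might look like with the google-genai Python SDK; the model id, config keys, and API key placeholder are assumptions, and real sessions typically stream audio or camera/screen frames as well.

```python
# Sketch of a bidirectional Live API session (google-genai Python SDK).
# Assumed values: MODEL id, CONFIG keys, and the API key placeholder.
import asyncio
from google import genai

client = genai.Client(api_key="YOUR_API_KEY")

MODEL = "gemini-2.0-flash-live-001"          # assumed Live-capable model id
CONFIG = {"response_modalities": ["TEXT"]}   # audio output is also possible

async def main() -> None:
    # The Live API keeps a stateful session open (WebSocket under the hood),
    # so input can be sent and output received concurrently.
    async with client.aio.live.connect(model=MODEL, config=CONFIG) as session:
        await session.send_client_content(
            turns={"role": "user", "parts": [{"text": "Hello, can you hear me?"}]},
            turn_complete=True,
        )
        # Stream the model's reply as it is generated.
        async for message in session.receive():
            if message.text:
                print(message.text, end="")

if __name__ == "__main__":
    asyncio.run(main())
```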

Gemini Live API
Google AI Studio on Twitter / X
https://t.co/MZz9dI3ws6 (Google AI Studio, @GoogleAIStudio, September 23, 2025)
https://x.com/GoogleAIStudio/status/1970545734736023564

Seonglae Cho