Multimodal Live API
The Multimodal Live API enables low-latency, two-way interactions
that use text, audio, and video input, with audio and text output.
This enables natural, human-like voice conversations and lets you
interrupt the model's responses at any time. The model's video
understanding capability expands the ways you can communicate with it,
letting you share camera input or screencasts and ask questions about them.
https://ai.google.dev/api/multimodal-live
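As a minimal sketch of what a session might look like with the google-genai
Python SDK: the model name gemini-2.0-flash-exp, the response_modalities
config key, and the connect/send/receive method names are assumptions based
on the SDK's live interface at the time of writing and may have changed;
treat the page linked above as the authoritative reference.

```python
# Sketch of a text-in / text-out live session (names are assumptions;
# see the Multimodal Live API reference linked above).
import asyncio

from google import genai

client = genai.Client(api_key="YOUR_API_KEY")

MODEL = "gemini-2.0-flash-exp"              # assumed model identifier
CONFIG = {"response_modalities": ["TEXT"]}  # request text output only


async def main() -> None:
    # Open a bidirectional, low-latency session with the model.
    async with client.aio.live.connect(model=MODEL, config=CONFIG) as session:
        # Send one user turn and mark it as complete.
        await session.send(input="Hello, can you hear me?", end_of_turn=True)

        # Stream the model's reply as it is generated.
        async for response in session.receive():
            if response.text:
                print(response.text, end="")


if __name__ == "__main__":
    asyncio.run(main())
```

The same session shape applies to audio and video: instead of a single text
turn, the client streams audio or video chunks and receives audio or text
back, and a new user turn can interrupt a response that is still streaming.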