Conversational Speech Model
Conversational AIs
Conversational AI Notion

Hertz-dev
Introducing hertz-dev - Standard Intelligence
For the last few months, we at Standard Intelligence have been researching scalable cross-modality learning. We're excited to announce that we're open-sourcing current checkpoints of our full-duplex, audio-only base model, hertz-dev, with a total of 8.5 billion parameters and three primary parts:
https://si.inc/hertz-dev/
Full-Duplex-Bench
Paper page - Full-Duplex-Bench: A Benchmark to Evaluate Full-duplex Spoken Dialogue Models on Turn-taking Capabilities
https://huggingface.co/papers/2503.04721
schema
A Full-duplex Speech Dialogue Scheme Based On Large Language Models
We present a generative dialogue system capable of operating in a full-duplex manner, allowing for seamless interaction. It is based on a large language model (LLM) carefully aligned to be aware...
https://arxiv.org/abs/2405.19487

Beyond one-on-one
Beyond one-on-one: Authoring, simulating, and testing dynamic human-AI group conversations
Erzhen Hu, Student Researcher, and Ruofei Du, Interactive Perception & Graphics Lead, Google XR
https://research.google/blog/beyond-one-on-one-authoring-simulating-and-testing-dynamic-human-ai-group-conversations/

Full Duplex Model
Streaming Requests & Realtime API in vLLM
Large language model inference has traditionally operated on a simple premise: the user submits a complete prompt (request), the model processes it, and returns
https://vllm.ai/blog/streaming-realtime
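Beyond STT: Realtime STT, and the Coming Era of Full Duplex Models
Maybe it's because my MBTI type is N, but I personally enjoy predicting the future. That said, I don't tend to trust my own predictions enough to base judgments or actions on them.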
https://monday9pm.com/stt%EB%A5%BC-%EB%84%98%EA%B3%A0-realtime-stt-%EA%B7%B8%EB%A6%AC%EA%B3%A0-%EA%B3%A7-%EB%8B%A4%EA%B0%80%EC%98%AC-full-duplex-%EB%AA%A8%EB%8D%B8-%EC%8B%9C%EB%8C%80%EB%A1%9C-7f50ed74f175


Seonglae Cho