What Kyutai does
Kyutai is a Paris-based open-science AI research lab focused on building and democratizing AI, with a strong emphasis on voice and speech. It releases its models and components as open source, concentrating on speech-native systems that process audio directly rather than only as transcribed text.
Key capabilities
- Moshi, a speech-native dialogue system for real-time spoken interaction
- Hibiki for real-time speech translation, and Unmute for giving text LLMs the ability to listen and speak
- Pocket TTS, a small text-to-speech model with voice cloning that can run in real time on CPU
- Mimi, a streaming neural audio codec, plus speech-to-text models and the Helium language model
Who it's for
Kyutai's open-source releases target developers and organizations building voice, speech, and multimodal AI applications. Because the components are freely available, they also suit researchers and builders who want low-latency speech technology they can run and adapt themselves.