Gemini 3.5 Live Translate: Near Real-Time Speech-to-Speech AI
A new audio model for near-real-time speech-to-speech translation
Generates translated speech continuously, lagging a few seconds behind the speaker; automatically detects over 70 languages with no manual setup and supports multilingual input out of the box
Preserves the speaker’s intonation, tempo, and pitch, and is robust to noise and varied acoustic conditions
It’s surprising this arrived only now: a real-time translator like this has long been missing from the makers of what is arguably the world’s leading translation service
A preview is currently available via the Gemini Live API and in Google AI Studio
It’s also available on LiveKit and Pipecat
There is also an update to the Google Translate app on Android and iOS: plug in your headphones and test it
They’ve also rolled it out in Google Meet
A new feature for Android
Listening mode: hold the phone to your ear as you would during a regular call and hear the translation directly through the earpiece speaker. It can stand in for a guide on a foreign-language tour, and it is handy whenever you don’t have headphones on hand
No more need to study to become a translator