cpaua

Gemini 3.5 Live Translate: Near Real-Time Speech-to-Speech AI

Gemini 3.5 Live Translate

A new audio model for near-real-time speech-to-speech translation

It generates speech continuously, lagging a few seconds behind the speaker, and automatically detects more than 70 languages with no manual setup; multilingual input is supported out of the box
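To make the "lagging a few seconds behind" idea concrete, here is a toy sketch of a streaming loop that emits output a fixed number of chunks behind its input. It is only an illustration of the general pattern, not how Gemini works internally and not the Live API: `stream_with_lag`, the `lag` parameter, and the stand-in `translate` callable are all hypothetical names invented for this example.

```python
from collections import deque
from typing import Callable, Iterable, Iterator


def stream_with_lag(
    chunks: Iterable[str],
    translate: Callable[[str], str],
    lag: int = 2,
) -> Iterator[str]:
    """Yield translated chunks while staying `lag` chunks behind the input.

    `translate` is a placeholder for a real model call (an assumption here).
    """
    buffer = deque()
    for chunk in chunks:
        buffer.append(chunk)
        # Emit the oldest chunk only once more than `lag` chunks are buffered,
        # so the output always trails the speaker by a fixed amount.
        if len(buffer) > lag:
            yield translate(buffer.popleft())
    # The speaker has stopped: flush whatever is still buffered.
    while buffer:
        yield translate(buffer.popleft())


if __name__ == "__main__":
    # str.upper stands in for translation so the example runs offline.
    for piece in stream_with_lag(["hello", "how", "are", "you"], str.upper):
        print(piece)
```

A larger `lag` gives the translator more context before it commits to output, at the cost of a longer delay; a real system would presumably tune that trade-off on audio rather than text chunks.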

It preserves the speaker’s intonation, tempo, and pitch, and is described as robust to noise and to varied acoustic conditions

It’s surprising this is only arriving now: a real-time translator like this has long been missing from the makers of what is generally considered the world’s leading translator

A preview is currently available via the Gemini Live API and in Google AI Studio

It’s also available on LiveKit and Pipecat

The Google Translate app on Android and iOS has been updated too: plug in your headphones and try it

They’ve also rolled it out in Google Meet

A new feature for Android

Listening mode: hold the phone to your ear as you would during a regular call and hear the translation through the earpiece speaker. It can stand in for a guide on a foreign-language tour, and it is handy whenever you don’t have headphones with you

Google Blog

No more studying to become translators
