Google DeepMind has introduced Gemini 3.5 Live Translate, a capability that brings near real-time, natural speech translation to Google AI Studio, Google Translate, and Google Meet. The feature is designed to preserve the fluidity and intonation of the original speech and to reduce the robotic quality common in conventional voice translation.

The model uses Gemini 3.5's multimodal architecture to process audio and text simultaneously, which DeepMind says enables sub-second translation latency. The company also claims the system maintains speaker cadence and emotional tone, which it presents as a significant improvement over traditional cascaded pipelines that first transcribe speech to text and then translate it. The announcement included no benchmark figures.
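To make the contrast with cascaded pipelines concrete, here is a minimal toy sketch in Python. It is not DeepMind's implementation and calls no Gemini API: the `Speech` and `Prosody` types, the one-entry phrase table, and both functions are hypothetical names invented for illustration. It shows only the structural point that once speech is reduced to a plain-text transcript, pace and emotion are no longer available to the later stages, whereas a model that works on the audio directly can carry them through.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Prosody:
    """Hypothetical stand-in for delivery cues such as cadence and tone."""
    pace: str
    emotion: str


@dataclass(frozen=True)
class Speech:
    """Hypothetical utterance: the words plus how they were spoken."""
    words: str
    prosody: Optional[Prosody]  # None means flat, default delivery


# Toy phrase table standing in for a real translation model (assumption).
TOY_TRANSLATIONS = {"good morning": "buenos días"}


def translate_text(text: str) -> str:
    """Look up a translation; fall back to the input if it is unknown."""
    return TOY_TRANSLATIONS.get(text.lower(), text)


def cascaded_translate(speech: Speech) -> Speech:
    """Speech-to-text, then text translation, then synthesis."""
    transcript = speech.words  # the transcript keeps only the words
    translated = translate_text(transcript)
    # Synthesis sees text only, so it has no original prosody to reuse.
    return Speech(translated, prosody=None)


def direct_translate(speech: Speech) -> Speech:
    """Single-stage sketch: delivery cues travel with the translation."""
    return Speech(translate_text(speech.words), prosody=speech.prosody)


if __name__ == "__main__":
    source = Speech("Good morning", Prosody(pace="fast", emotion="excited"))
    print("cascaded:", cascaded_translate(source))
    print("direct:  ", direct_translate(source))
```

In this toy, both paths produce the same translated words, but only the direct path still carries the speaker's pace and emotion. A real cascaded system can recover some prosody through side channels, so the sketch illustrates the default limitation rather than a hard rule.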

Deployment spans three Google products: developers can integrate Live Translate through an API in AI Studio, consumers can use it for conversations in Google Translate, and Meet users get real-time translated captions. The feature supports a limited set of language pairs at launch; DeepMind did not specify which.

The launch positions Gemini 3.5 against speech offerings such as OpenAI's Whisper, which transcribes speech and can translate it into English text, and Microsoft's Azure Speech Translation, which provides real-time speech translation. DeepMind's emphasis on natural intonation could differentiate it in enterprise and accessibility markets, though the company released no comparative benchmarks.

Researchers and developers have reacted with cautious optimism. Some note that real-world performance will depend on how well the system handles varied accents and background noise. DeepMind says the system is optimized for clean audio environments and that it is working to improve noise resilience.