Voxtral: Mistral's Open-Weights TTS Alternative to ElevenLabs
An open-weights alternative to ElevenLabs has appeared.
Voxtral is a speech synthesis (text-to-speech) model from Mistral:
- only 4 billion parameters
- 70 ms latency for voice agents
- voice cloning from 3 seconds of audio
- 9 languages + cross-lingual transfer
- 68.4% win rate against ElevenLabs Flash v2.5
Open weights are available on Hugging Face.
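
To put the 4-billion-parameter figure in perspective, here is a rough back-of-the-envelope estimate of how much memory the weights alone would take at common numeric precisions. The bytes-per-parameter values are general assumptions, not details from the announcement; the precision of the released checkpoint is not stated here, and real usage will be higher once activations and any audio codec components are included.

```python
# Rough weight-memory estimate for a 4-billion-parameter model.
PARAMS = 4_000_000_000  # "4 billion parameters" from the announcement

# Bytes per parameter for common storage formats.
# These are generic assumptions, not the published format of the Voxtral weights.
BYTES_PER_PARAM = {"fp32": 4, "bf16": 2, "int8": 1, "int4": 0.5}


def weight_gb(params: int, bytes_per_param: float) -> float:
    """Return the size of the weights in gigabytes (1 GB = 10**9 bytes)."""
    return params * bytes_per_param / 1e9


if __name__ == "__main__":
    for fmt, size in BYTES_PER_PARAM.items():
        print(f"{fmt}: ~{weight_gb(PARAMS, size):.0f} GB")
```

At 16-bit precision this works out to about 8 GB for the weights, which is why a model of this size is a plausible fit for a single consumer GPU.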