30 Mar 2026 Mistral: Voxtral TTS, Forge, Leanstral, & what's next for Mistral 4 w/ Pavan Kumar Reddy & Guillaume Lample
Mistral's Voxtral TTS is a 3B-parameter text-to-speech model that combines a neural audio codec, a split between semantic and acoustic tokens, and efficient flow matching to serve multilingual, real-time applications. The episode covers how the model balances quality against cost, and which refinements to architecture, tokenization, and domain-specific training may come next.