Mistral has introduced Voxtral, a family of open-source speech models positioned as a cost-effective alternative to closed APIs. The models combine speech recognition, multilingual support, and long-context processing. Two variants are available, both under the Apache 2.0 license: a 24B model for production and a 3B model for edge deployments. A 32k-token context length supports transcription of long audio, and question answering over the audio is built in. The models are reported to perform strongly across a range of languages and tasks.
Mistral's Voxtral speech models offer an alternative to closed APIs, combining high accuracy, multilingual support, and long-context processing at competitive prices.
The Voxtral models come in a 24B variant for production and a 3B variant for local use, both released under the Apache 2.0 license.
Voxtral offers a 32k-token context length, built-in question answering, and structured summary generation.
In benchmark tests, Voxtral Small consistently outperforms competitors such as Whisper and achieves state-of-the-art results in multiple languages.