Voxtral Realtime
Beginner · Alexander H. Liu, Andy Ehrenberg et al. · Feb 11 · arXiv
Voxtral Realtime is a speech-to-text model that types what you say almost instantly, while keeping accuracy close to the best offline systems.
#streaming ASR · #real-time transcription · #causal audio encoder