Papers (2)

#forced alignment

Qwen3-ASR Technical Report

Qwen3‑ASR is a family of speech models that hear, understand, and write down speech in 52 languages and dialects, and they can also tell you when each word was spoken.

#ASR #forced alignment #timestamps

End-to-End Joint ASR and Speaker Role Diarization with Child-Adult Interactions

Intermediate

Anfeng Xu, Tiantian Feng et al., Jan 25, arXiv

This paper builds one smart system that listens to child–adult conversations and writes what was said, who said it, and exactly when each person spoke.

#end-to-end ASR #speaker diarization #child speech