BUILDING AN OPEN UZBEK SPEECH-TO-TEXT STACK FOR LOW-RESOURCE ASR
Keywords:
Uzbek ASR, low-resource speech, wav2vec 2.0, Whisper, Kaldi, data augmentation, subword modeling, lexicon-free decoding.

Abstract
In this article, we present an open, end-to-end speech-to-text stack for Uzbek that lowers the cost of building practical ASR in low-resource settings. The stack combines reproducible data pipelines, self-supervised pretraining, lightweight fine-tuning, and lexicon-free decoding. On public Uzbek evaluation splits, fine-tuned wav2vec 2.0 and Whisper models outperform strong baselines. We release recipes, tokenizers, and evaluation scripts to enable fair, repeatable benchmarking and rapid local adaptation.
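As a minimal sketch of the lexicon-free decoding and evaluation step described above, the snippet below runs greedy CTC decoding with a fine-tuned wav2vec 2.0 checkpoint and scores the hypothesis with word error rate. The checkpoint identifier, audio path, and reference text are placeholders, not the artifacts released with this paper, and the code assumes the Hugging Face transformers, torchaudio, and jiwer packages rather than the paper's own recipes.

```python
# Sketch: lexicon-free (greedy CTC) decoding + WER scoring.
# MODEL_ID, the audio file, and the reference transcript are hypothetical
# placeholders; substitute the released Uzbek checkpoint and data.
import torch
import torchaudio
from transformers import Wav2Vec2ForCTC, Wav2Vec2Processor
from jiwer import wer

MODEL_ID = "your-org/wav2vec2-uzbek"  # placeholder checkpoint id

processor = Wav2Vec2Processor.from_pretrained(MODEL_ID)
model = Wav2Vec2ForCTC.from_pretrained(MODEL_ID).eval()

# Load mono audio and resample to the 16 kHz rate wav2vec 2.0 expects.
waveform, sr = torchaudio.load("example_uz.wav")
if sr != 16_000:
    waveform = torchaudio.functional.resample(waveform, sr, 16_000)

inputs = processor(waveform.squeeze(0), sampling_rate=16_000, return_tensors="pt")
with torch.no_grad():
    logits = model(inputs.input_values).logits

# Greedy CTC decoding: per-frame argmax; the tokenizer collapses repeats
# and removes blanks, so no pronunciation lexicon or external LM is needed.
pred_ids = torch.argmax(logits, dim=-1)
hypothesis = processor.batch_decode(pred_ids)[0]

reference = "namunaviy transkripsiya"  # placeholder reference text
print(f"hyp: {hypothesis}")
print(f"WER: {wer(reference, hypothesis):.3f}")
```

Swapping in a beam-search decoder with a character or subword language model follows the same interface; the greedy pass here is only meant to illustrate the lexicon-free path end to end.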
Published
2025-09-23
How to Cite
BUILDING AN OPEN UZBEK SPEECH-TO-TEXT STACK FOR LOW-RESOURCE ASR. (2025). International Conference on Scientific Research in Natural and Social Sciences, 32-35. https://econfseries.com/index.php/1/article/view/2889