BUILDING AN OPEN UZBEK SPEECH-TO-TEXT STACK FOR LOW-RESOURCE ASR

Authors

  • Sukhrob Avezov Sobirovich, PhD, Lecturer, Department of Russian Language and Literature, Bukhara State University

Keywords

Uzbek ASR, low-resource speech, wav2vec 2.0, Whisper, Kaldi, data augmentation, subword modeling, lexicon-free decoding.

Abstract

In this article, we present an open, end-to-end speech-to-text stack for Uzbek that lowers the cost of building practical ASR systems in low-resource settings. The stack combines reproducible data pipelines, self-supervised pretraining, lightweight fine-tuning, and lexicon-free decoding. On public Uzbek splits, fine-tuned wav2vec 2.0 and Whisper models outperform strong baselines. We release recipes, tokenizers, and evaluation scripts to enable fair, repeatable benchmarking and rapid local adaptation.
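As a concrete illustration of the lexicon-free decoding and evaluation steps described above, the minimal Python sketch below runs greedy CTC decoding with a fine-tuned wav2vec 2.0 checkpoint and scores the hypothesis with word error rate. The checkpoint name, audio path, and reference transcript are placeholders for illustration only, not artifacts released with the paper.

# Minimal sketch: greedy, lexicon-free CTC decoding with a fine-tuned
# wav2vec 2.0 model, scored with word error rate (WER).
# The checkpoint name, audio file, and reference below are placeholders.
import torch
import soundfile as sf
from transformers import Wav2Vec2ForCTC, Wav2Vec2Processor
from jiwer import wer

CHECKPOINT = "your-org/wav2vec2-uzbek"  # hypothetical fine-tuned Uzbek model

processor = Wav2Vec2Processor.from_pretrained(CHECKPOINT)
model = Wav2Vec2ForCTC.from_pretrained(CHECKPOINT).eval()

def transcribe(wav_path: str) -> str:
    """Greedy CTC decode: argmax over frames, collapse repeats, drop blanks."""
    speech, sr = sf.read(wav_path)
    assert sr == 16_000, "wav2vec 2.0 expects 16 kHz mono audio"
    inputs = processor(speech, sampling_rate=sr, return_tensors="pt")
    with torch.no_grad():
        logits = model(inputs.input_values).logits
    pred_ids = torch.argmax(logits, dim=-1)
    # batch_decode collapses repeated tokens and removes CTC blanks at the
    # character level, so no pronunciation lexicon or external LM is needed.
    return processor.batch_decode(pred_ids)[0]

if __name__ == "__main__":
    hyp = transcribe("example_uz.wav")   # placeholder audio file
    ref = "assalomu alaykum"             # placeholder reference transcript
    print("hyp:", hyp)
    print(f"WER: {wer(ref, hyp):.3f}")

The same scoring call can be applied corpus-wide by concatenating references and hypotheses, which is how WER is typically reported on public test splits.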

Published

2025-09-23

Section

Articles

How to Cite

BUILDING AN OPEN UZBEK SPEECH-TO-TEXT STACK FOR LOW-RESOURCE ASR. (2025). International Conference on Scientific Research in Natural and Social Sciences, 32-35. https://econfseries.com/index.php/1/article/view/2889