jonatasgrosman/wav2vec2-large-xlsr-53-russian

🧠 AI Model · jonatasgrosman

wav2vec2-large-xlsr-53 fine-tuned for Russian speech recognition, with over 2.5M downloads.

jonatasgrosman/wav2vec2-large-xlsr-53-russian is an open-source automatic speech recognition (ASR) model fine-tuned for Russian. It builds on the wav2vec2-large-xlsr-53 checkpoint, which is pretrained with self-supervised learning on unlabeled multilingual speech and then fine-tuned on labeled data for ASR. This model was fine-tuned on the Russian portion of Mozilla Common Voice 6.0. It supports inference with PyTorch, JAX, and TensorFlow. Key features include roughly 300M parameters, tokenization via Wav2Vec2CTCTokenizer, and competitive performance on the Hugging Face ASR leaderboard for Russian. With over 2.5 million downloads, it is a widely used resource for Russian speech recognition applications.
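
A minimal PyTorch inference sketch, assuming the `transformers`, `torch`, and `librosa` packages are installed. The loader choice (`librosa`) is an assumption; any loader that yields 16 kHz mono float audio should work. `collapse_ctc` is an illustrative helper, not part of any library: it shows the collapse step (merge repeats, drop blanks) that a CTC tokenizer performs when decoding, which `batch_decode` already handles in the real pipeline.

```python
MODEL_ID = "jonatasgrosman/wav2vec2-large-xlsr-53-russian"
SAMPLE_RATE = 16_000  # wav2vec2 models expect 16 kHz mono input


def collapse_ctc(ids, blank_id):
    """Illustrative greedy CTC collapse: merge consecutive repeats, drop blanks."""
    out = []
    prev = None
    for i in ids:
        if i != prev and i != blank_id:
            out.append(i)
        prev = i
    return out


def transcribe(audio_path):
    """Transcribe one audio file with greedy (argmax) CTC decoding."""
    # Heavy dependencies are imported lazily so the helper above stays usable
    # without them.
    import librosa
    import torch
    from transformers import Wav2Vec2ForCTC, Wav2Vec2Processor

    processor = Wav2Vec2Processor.from_pretrained(MODEL_ID)
    model = Wav2Vec2ForCTC.from_pretrained(MODEL_ID)
    model.eval()

    # librosa resamples to the requested rate and returns a mono float array.
    speech, _ = librosa.load(audio_path, sr=SAMPLE_RATE)
    inputs = processor(
        speech, sampling_rate=SAMPLE_RATE, return_tensors="pt", padding=True
    )

    with torch.no_grad():
        logits = model(**inputs).logits

    # Pick the most likely token per frame; batch_decode collapses repeats
    # and blanks and maps token ids back to text.
    predicted_ids = torch.argmax(logits, dim=-1)
    return processor.batch_decode(predicted_ids)[0]
```

Calling `transcribe("sample.wav")` (a hypothetical file path) downloads the model weights on first use and returns the recognized Russian text as a string.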

💡Highlights

├─300M parameter wav2vec2 model
├─Fine-tuned on Common Voice 6.0 Russian
└─2.5M+ downloads from Hugging Face

🎯For

├─Speech recognition researchers
├─NLP developers
└─Russian language applications

🔗Links

└─Model on Hugging Face