jonatasgrosman/wav2vec2-large-xlsr-53-hungarian
🧠 AI Modeljonatasgrosman
Fine-tuned Wav2Vec2 Large XLSR-53 for Hungarian speech recognition.
jonatasgrosman/wav2vec2-large-xlsr-53-hungarian is an open-source automatic speech recognition model fine-tuned specifically for Hungarian. It builds upon Facebook's Wav2Vec2 Large XLSR-53, which was pretrained on 53 languages using self-supervised learning. The model was further fine-tuned on the Hungarian subset of the Mozilla Common Voice dataset. It uses a wav2vec2 architecture with a transformer encoder, achieving state-of-the-art results for Hungarian ASR. The model supports input audio sampled at 16kHz and outputs transcribed text. Key features include: fine-tuning with connectionist temporal classification (CTC), leveraging cross-lingual representations, and applicability in low-resource settings for Hungarian. It has gained significant traction with nearly 2 million downloads, indicating its reliability and performance in the community.
💡Highlights
- ├─Large XLSR-53 pretrained model
- ├─Fine-tuned for Hungarian ASR
- └─1.9M downloads on HuggingFace
🎯For
- ├─Speech recognition researchers
- ├─Hungarian NLP developers
- └─AI enthusiasts working on low-resource languages