jonatasgrosman/wav2vec2-large-xlsr-53-hungarian

🧠 AI Model · jonatasgrosman

Fine-tuned Wav2Vec2 Large XLSR-53 for Hungarian speech recognition.

jonatasgrosman/wav2vec2-large-xlsr-53-hungarian is an open-source automatic speech recognition (ASR) model fine-tuned for Hungarian. It builds on Facebook's Wav2Vec2 Large XLSR-53, which was pretrained with self-supervised learning on speech in 53 languages, and was then fine-tuned on the Hungarian subset of the Mozilla Common Voice dataset. The architecture is wav2vec2 with a transformer encoder, fine-tuned using connectionist temporal classification (CTC), and it is reported to achieve state-of-the-art results for Hungarian ASR. The model expects input audio sampled at 16 kHz and outputs transcribed text. Because it reuses the cross-lingual representations learned during pretraining, it is applicable to low-resource settings such as Hungarian, where labeled speech is limited. It has nearly 2 million downloads, which suggests broad use in the community.
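
The description above (16 kHz input, CTC output) can be sketched in Python as follows. This is a minimal illustration, not the author's reference script: it assumes the `transformers`, `torch`, and `librosa` packages are installed, and the helper names `ctc_collapse` and `transcribe` are hypothetical. `ctc_collapse` only illustrates the greedy CTC rule (merge repeated IDs, then drop blanks) that the processor's `batch_decode` applies internally; its `blank_id=0` default is an assumption for the toy example, not a value taken from this model's vocabulary.

```python
MODEL_ID = "jonatasgrosman/wav2vec2-large-xlsr-53-hungarian"
TARGET_SAMPLE_RATE = 16_000  # the model expects 16 kHz audio


def ctc_collapse(ids, blank_id=0):
    """Toy greedy CTC rule: merge consecutive repeats, then drop blanks.

    blank_id=0 is an illustrative assumption, not this model's real blank ID.
    """
    collapsed = []
    previous = None
    for token_id in ids:
        if token_id != previous and token_id != blank_id:
            collapsed.append(token_id)
        previous = token_id
    return collapsed


def transcribe(paths):
    """Transcribe audio files with greedy (argmax) CTC decoding."""
    # Imported lazily so the toy helper above works without heavy dependencies.
    import librosa
    import torch
    from transformers import Wav2Vec2ForCTC, Wav2Vec2Processor

    processor = Wav2Vec2Processor.from_pretrained(MODEL_ID)
    model = Wav2Vec2ForCTC.from_pretrained(MODEL_ID)
    model.eval()

    # librosa resamples each file to the requested rate while loading.
    speech = [librosa.load(path, sr=TARGET_SAMPLE_RATE)[0] for path in paths]
    inputs = processor(
        speech,
        sampling_rate=TARGET_SAMPLE_RATE,
        return_tensors="pt",
        padding=True,
    )

    with torch.no_grad():
        logits = model(
            inputs.input_values, attention_mask=inputs.attention_mask
        ).logits

    # Pick the most likely token per frame; batch_decode collapses them to text.
    predicted_ids = torch.argmax(logits, dim=-1)
    return processor.batch_decode(predicted_ids)
```

Calling `transcribe(["sample.wav"])` (a hypothetical file path) would download the model weights on first use and return a list with one transcription per file. Greedy decoding is the simplest option; decoding with a language model is a possible refinement that this sketch does not cover.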

💡Highlights

├─Large XLSR-53 pretrained model
├─Fine-tuned for Hungarian ASR
└─1.9M downloads on Hugging Face

🎯For

├─Speech recognition researchers
├─Hungarian NLP developers
└─AI enthusiasts working on low-resource languages

🔗Links

└─Model Card