google-bert/bert-base-multilingual-uncased
🧠 AI Modelgoogle-bert
Multilingual BERT model for masked language modeling across 102 languages.
This is the base uncased multilingual BERT model released by Google. It is trained on the top 102 languages with the largest Wikipedia using a masked language modeling objective. The model uses a vocabulary of 119K tokens and supports sequences up to 512 tokens. It has 12 layers, 768 hidden units, and 12 attention heads, totaling 110M parameters. The uncased version lowercases text, making it suitable for tasks where case is unimportant. It is available in safetensors format and compatible with Hugging Face Transformers, PyTorch, TensorFlow, and JAX. Widely used as a starting point for fine-tuning on multilingual tasks like text classification, NER, and question answering.
💡Highlights
- ├─110M parameters
- ├─102 languages supported
- └─uncased text processing
🎯For
- ├─NLP researchers
- ├─multilingual developers
- └─data scientists