sentence-transformers/paraphrase-multilingual-mpnet-base-v2

🧠 AI Model · sentence-transformers

Multilingual sentence embeddings for semantic similarity across 50+ languages.

This model uses an XLM-RoBERTa backbone, trained as a multilingual student of the English paraphrase-mpnet-base-v2 model on parallel data covering 50+ languages. It maps sentences and short paragraphs to 768-dimensional embeddings. Although the underlying position-embedding table has 514 entries, the sentence-transformers configuration truncates input at 128 tokens by default. Key features include: (1) Multilingual support for 50+ languages, (2) Strong results on semantic textual similarity tasks, (3) Compatibility with the sentence-transformers library for easy integration, (4) Exportable to ONNX and OpenVINO for production. The model applies mean pooling over the XLM-RoBERTa token embeddings (the MPNet architecture belongs to the teacher model, not to this one) to produce sentence-level representations. It is widely used in retrieval-augmented generation (RAG) systems, multilingual search, and cross-lingual NLP pipelines.
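The mean pooling step and the similarity comparison described above can be sketched in a few lines of plain Python. This is an illustrative sketch, not the library's implementation: `mean_pool`, `cosine`, and `embed` are hypothetical helper names, and the toy vectors are made up rather than taken from the model. Only `embed` touches the real model, and it assumes the sentence-transformers package is installed.

```python
import math

MODEL_NAME = "sentence-transformers/paraphrase-multilingual-mpnet-base-v2"


def mean_pool(token_vectors, attention_mask):
    """Average the token vectors whose mask entry is 1, ignoring padding."""
    kept = [vec for vec, keep in zip(token_vectors, attention_mask) if keep]
    if not kept:
        raise ValueError("attention_mask selects no tokens")
    dim = len(kept[0])
    return [sum(vec[i] for vec in kept) / len(kept) for i in range(dim)]


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def embed(sentences):
    """Encode sentences with the real model (assumes sentence-transformers is installed)."""
    # Imported lazily so the toy helpers above run without the library.
    from sentence_transformers import SentenceTransformer

    model = SentenceTransformer(MODEL_NAME)
    return model.encode(sentences)  # one embedding per input sentence


# Toy example: three 2-dimensional "token" vectors; the last one is padding.
tokens = [[1.0, 3.0], [3.0, 5.0], [9.0, 9.0]]
mask = [1, 1, 0]
sentence_vector = mean_pool(tokens, mask)  # averages only the first two vectors
```

With the library installed, `embed(["Hello world", "Hallo Welt"])` would download the model on first use and return one vector per sentence; passing those two vectors to `cosine` gives a cross-lingual similarity score.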

💡Highlights

├─Multilingual: 50+ languages supported
├─4.6M downloads, 465 likes on Hugging Face
└─XLM-RoBERTa backbone, distilled from an MPNet-based teacher

🎯For

├─NLP Researchers
├─Developers of Multilingual Apps
└─Enterprise Semantic Search Teams

🔗Links

└─Hugging Face Model Card