Qdrant/bge-small-en-v1.5-onnx-Q

🧠 AI Model · Qdrant

High-performance, quantized ONNX version of the BGE-small-en-v1.5 embedding model for efficient vector search.

Qdrant/bge-small-en-v1.5-onnx-Q is an optimized build of the original BGE-small-en-v1.5 embedding model. Qdrant converted the model to the ONNX (Open Neural Network Exchange) format and applied quantization, which makes it cheaper to run in resource-constrained environments. The model produces vector representations of English text suited to semantic search, clustering, and retrieval-augmented generation (RAG) pipelines. It is compatible with Text Embeddings Inference (TEI) and standard ONNX runtimes, so it fits into existing AI stacks. It is intended to stay close to the accuracy of the original BGE-small-en-v1.5 while improving inference speed, which makes it a practical choice for developers who prioritize performance and scalability in vector search applications.
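As a rough illustration of how embeddings like these drive semantic search, the sketch below ranks documents by cosine similarity to a query vector. The 3-dimensional vectors are made-up stand-ins, not real model output (BGE-small-en-v1.5 produces 384-dimensional vectors), and the function names `cosine_similarity` and `rank_documents` are hypothetical helpers written for this example.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    # A zero vector has no direction; treat it as dissimilar to everything.
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_documents(query_vec, doc_vecs):
    """Return (index, score) pairs sorted from most to least similar."""
    scored = [(i, cosine_similarity(query_vec, v)) for i, v in enumerate(doc_vecs)]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Made-up stand-in embeddings; in practice these would come from the model.
doc_vecs = [
    [0.9, 0.1, 0.0],
    [0.0, 1.0, 0.0],
    [0.7, 0.7, 0.1],
]
query_vec = [1.0, 0.0, 0.0]

ranking = rank_documents(query_vec, doc_vecs)
for index, score in ranking:
    print(f"doc {index}: {score:.3f}")
```

In a real pipeline a vector database would typically perform this ranking with an approximate index instead of a full scan; the scoring idea stays the same.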

💡Highlights

├─Optimized ONNX format
├─Quantized for low latency
└─650k+ downloads on Hugging Face
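To give a feel for what quantization involves, here is a minimal sketch of symmetric 8-bit quantization applied to a short list of weights. It is a generic illustration and an assumption on our part, not necessarily the scheme used to produce this model; `quantize_int8` and `dequantize_int8` are hypothetical names.

```python
def quantize_int8(values):
    """Map floats to integers in [-127, 127] using one shared scale.

    Assumes `values` is a non-empty list of floats.
    """
    max_abs = max(abs(v) for v in values)
    # Avoid dividing by zero when every value is 0.0.
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    quantized = [max(-127, min(127, round(v / scale))) for v in values]
    return quantized, scale


def dequantize_int8(quantized, scale):
    """Recover approximate floats from the integer codes."""
    return [q * scale for q in quantized]


weights = [0.5, -1.0, 0.25, 0.0]
q, scale = quantize_int8(weights)
restored = dequantize_int8(q, scale)

# The rounding step bounds the per-value error by about half a scale step.
max_error = max(abs(w - r) for w, r in zip(weights, restored))
print(q, scale, max_error)
```

Storing small integers instead of 32-bit floats shrinks the model and lets runtimes use faster integer arithmetic, at the cost of the small rounding error measured above.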

🎯For

├─AI Engineers
└─Backend Developers

🔗Links

└─HuggingFace Repository