Qwen/Qwen3.5-27B-FP8

🧠 AI ModelQwen

FP8-quantized 27B multimodal model from Qwen with 1M+ downloads.

Qwen3.5-27B-FP8 is a quantized variant of the Qwen3.5-27B base model, optimized using FP8 (8-bit floating point) precision to significantly reduce memory usage and accelerate inference while maintaining model quality. Built on the Qwen3.5 architecture with safetensors format, it supports both text and image inputs (image-text-to-text pipeline), making it suitable for multimodal conversational applications. The model is distributed under the Apache-2.0 license, allowing permissive commercial and research use. It is compatible with the transformers library and supports standard endpoints for deployment. The FP8 quantization makes this 27B parameter model more accessible for inference on hardware with limited VRAM, reducing the memory requirement approximately by half compared to its FP16 counterpart while preserving most of the original model's capabilities.

💡Highlights

├─27B params with FP8 precision quantization
├─Multimodal: image-text-to-text support
├─1M+ Hugging Face downloads
└─Apache-2.0 open-source license

🎯For

├─AI researchers
├─ML engineers
└─Open-source developers

🔗Links

└─Hugging Face Model Page