Gemini 3.1 Flash Lite

🧠 AI Model · google

Google's high-efficiency multimodal model for low-latency, high-volume workloads.

Gemini 3.1 Flash Lite is Google's latest generally available high-efficiency multimodal model, designed for low-latency, high-volume production workloads. It accepts text, image, video, audio, and file (PDF) inputs and outputs text. Its context length of 1,048,576 tokens lets it handle lengthy documents and long multi-turn conversations. Pricing is $0.25 per million input tokens and $1.50 per million output tokens. Key features include structured outputs, reasoning, temperature control, and response formatting. The model suits developers building cost-sensitive AI applications such as real-time chat, data extraction, and lightweight agentic workflows.
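To make the pricing concrete, here is a minimal Python sketch that estimates spend from token counts using the rates and context length listed above. The function names and the example workload are hypothetical, chosen only for illustration, and the sketch assumes flat per-token billing with no caching discounts, tiering, or per-modality surcharges, which may not match actual billing.

```python
# Rates and context length as listed above; everything else is illustrative.
INPUT_USD_PER_MILLION = 0.25
OUTPUT_USD_PER_MILLION = 1.50
CONTEXT_LENGTH_TOKENS = 1_048_576


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate cost in USD, assuming flat per-token billing (an assumption)."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    total = (input_tokens * INPUT_USD_PER_MILLION
             + output_tokens * OUTPUT_USD_PER_MILLION)
    return total / 1_000_000


def prompt_fits_context(prompt_tokens: int) -> bool:
    """Rough check that a prompt's token count is within the context length."""
    return 0 <= prompt_tokens <= CONTEXT_LENGTH_TOKENS


if __name__ == "__main__":
    # Hypothetical workload: 10,000 requests, each 2,000 input / 300 output tokens.
    requests = 10_000
    cost = estimate_cost_usd(requests * 2_000, requests * 300)
    print(f"Estimated cost: ${cost:.2f}")
    print("800k-token prompt fits:", prompt_fits_context(800_000))
```

Under these assumptions, the hypothetical workload comes to 20 million input tokens ($5.00) plus 3 million output tokens ($4.50). Output tokens cost six times as much as input tokens, so short responses matter more to the bill than short prompts.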

💡Highlights

├─1M token context
├─Multimodal input: text, image, video, audio, PDF files
└─$0.25/M input tokens

🎯For

├─AI developers
├─cloud engineers
└─cost-conscious startups

🔗Links

└─OpenRouter page