Gemini 3.1 Flash Lite
🧠 AI Model · Google
Google's high-efficiency multimodal model for low-latency, high-volume workloads.
Gemini 3.1 Flash Lite is Google's latest generally available high-efficiency multimodal model, designed for low-latency, high-volume production workloads. It accepts text, images, video, audio, and files (PDFs) as input and produces text output. Its context length of 1,048,576 tokens lets it handle lengthy documents and long multi-turn conversations. Pricing is $0.25 per million input tokens and $1.50 per million output tokens. Key features include structured outputs, reasoning capabilities, temperature control, and response formatting. The model suits developers building cost-sensitive AI applications such as real-time chat, data extraction, and lightweight agentic workflows.
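For budgeting, the per-token prices quoted above can be turned into a rough cost estimate. The sketch below is illustrative only: the helper names `estimate_cost_usd` and `fits_in_context` are hypothetical, the constants are copied from this description and may not match current pricing, and the context check assumes the 1,048,576-token limit applies to the prompt alone. It does not call any Google API.

```python
# Figures copied from the description above; verify against current pricing.
INPUT_USD_PER_MILLION = 0.25
OUTPUT_USD_PER_MILLION = 1.50
CONTEXT_LIMIT_TOKENS = 1_048_576


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate the USD cost of one request (hypothetical helper)."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    total = (input_tokens * INPUT_USD_PER_MILLION
             + output_tokens * OUTPUT_USD_PER_MILLION)
    return total / 1_000_000


def fits_in_context(input_tokens: int) -> bool:
    """Check a prompt against the context length.

    Assumption: the limit is treated as a cap on prompt tokens only.
    """
    return 0 <= input_tokens <= CONTEXT_LIMIT_TOKENS


if __name__ == "__main__":
    # Example workload: 2,000 prompt tokens and 300 response tokens per request.
    per_request = estimate_cost_usd(2_000, 300)
    print(f"per request: ${per_request:.5f}")
    print(f"per million requests: ${per_request * 1_000_000:,.2f}")
```

At these rates, a request with 2,000 input tokens and 300 output tokens costs roughly $0.00095, so a million such requests would come to about $950. Output tokens are six times the price of input tokens, so response length tends to dominate the bill for chat-style workloads.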
💡 Highlights
- 1M token context
- Multimodal input: text, image, video, audio
- $0.25/M input tokens
🎯 For
- AI developers
- Cloud engineers
- Cost-conscious startups