GPT-5 Mini vs Gemini 3.1 Flash Lite vs Claude Haiku 4.5
These are three current compact models teams are likely to shortlist before they move up to flagship pricing. They overlap enough to compete in the same product conversation, but they diverge in where they spend money: Gemini on cheaper output, OpenAI on balanced tooling, and Claude on Anthropic-specific agent features.
Published May 30, 2026Compact model comparisonBuilt from site data
At a glance
Three current models, three different cost curves
These figures are based on official provider docs as checked on May 30, 2026. Extra tool charges, cache writes, grounded requests, and computer-use fees are not included in the sample monthly cost below.
Low-cost multimodal apps that need broad input support, lower output spend, and Google-native retrieval options.
Input
$0.25
Output
$1.50
Context
65,536
Sample monthly
$85.00
VisionAudio inputVideo inputURL context
What to watch: The output price is attractive, but the context window is smaller than GPT-5 Mini and the model surface is less conventional than Claude for agent workflows.
Sample workload: 100,000 requests per month, 1,000 input tokens, and 400 output tokens per request.
Model
Input / 1M
Output / 1M
Monthly cost
Read
GPT-5 Mini
OpenAI
$0.25
$2.00
$105.00
Balanced pricing and the broadest general-purpose default of the three.
Gemini 3.1 Flash Lite
Google
$0.25
$1.50
$85.00
Cheapest output cost here, with broad multimodal inputs and URL context.
Claude Haiku 4.5
Anthropic
$1.00
$5.00
$300.00
Most expensive in this workload, but strongest Anthropic-native agent surface.
Cheapest in this scenario
Gemini 3.1 Flash Lite
At $85.00, it stays ahead because its output price is lower than GPT-5 Mini and much lower than Claude Haiku 4.5.
Largest context
GPT-5 Mini
If your app keeps long threads or large retrieved chunks in one call, the extra headroom matters more than a small delta on input price.
Lowest input cost
GPT-5 Mini
GPT-5 Mini and Gemini 3.1 Flash Lite start from the same input rate, so the real split comes from output price and platform features.
How to choose
Pick by the pressure in your product
Choose GPT-5 Mini
When you want the cleanest general-purpose default: strong tool calling, prompt caching, web search, image input, and a 128k context window without moving up to full GPT-5 pricing.
Choose Gemini 3.1 Flash Lite
When your app writes a lot, or when multimodal input matters more than raw context size. It is the most economical writer here and the broadest on audio, video, and URL input.
Choose Claude Haiku 4.5
When you specifically want Anthropic's workflow surface, especially prompt caching, computer use, and PDF-heavy agent patterns. It needs that fit because the token bill is higher.
Sources
Where to verify the details
Use the provider docs for final pricing checks. This comparison keeps to the clean shared baseline of input and output token prices, but each provider adds its own rules for caching, tools, and grounded requests.