Data-led article

GPT-5 Mini vs Gemini 3.1 Flash Lite vs Claude Haiku 4.5

These three compact models are the ones teams are likely to shortlist before moving up to flagship pricing. They overlap enough to compete for the same workloads, but each offers something different for the money: Gemini has the cheapest output, OpenAI has balanced tooling, and Claude has Anthropic-specific agent features.

Published May 30, 2026 | Compact model comparison | Built from site data

At a glance

Three current models, three different cost curves

These figures are based on official provider docs as checked on May 30, 2026. Extra tool charges, cache writes, grounded requests, and computer-use fees are not included in the sample monthly cost below.

OpenAI

GPT-5 Mini

A balanced default when you want a current small model with strong tool support and predictable pricing.

Input: $0.25 per 1M tokens
Output: $2.00 per 1M tokens
Context: 128,000 tokens
Sample monthly: $105.00
Features: Vision, Function calling, Prompt caching, Web search

What to watch: Output at $2.00 per 1M tokens is reasonable, but Gemini 3.1 Flash Lite is cheaper at $1.50, and Claude Haiku 4.5 offers more Anthropic-specific agent features.

Google

Gemini 3.1 Flash Lite

Best suited to low-cost multimodal apps that need broad input support, lower output spend, and Google-native retrieval options.

Input: $0.25 per 1M tokens
Output: $1.50 per 1M tokens
Context: 65,536 tokens
Sample monthly: $85.00
Features: Vision, Audio input, Video input, URL context

What to watch: The output price is attractive, but the 65,536-token context window is about half of GPT-5 Mini's 128,000, and the features listed here do not include the prompt caching or computer use that Claude Haiku 4.5 brings to agent workflows.

Anthropic

Claude Haiku 4.5

Best suited to teams already leaning on Anthropic workflows such as prompt caching, computer use, and PDF-heavy assistants.

Input: $1.00 per 1M tokens
Output: $5.00 per 1M tokens
Context: 64,000 tokens
Sample monthly: $300.00
Features: Vision, Prompt caching, Computer use, PDF input

What to watch: At $5.00 per 1M output tokens, it costs 2.5 times GPT-5 Mini's output rate and more than 3 times Gemini 3.1 Flash Lite's, so in a write-heavy app it needs feature fit to justify the bill.

Scenario

What the same workload costs

Sample workload: 100,000 requests per month, 1,000 input tokens, and 400 output tokens per request.
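The arithmetic behind the monthly figures can be sketched as follows. This is a minimal illustration that uses only the base input and output prices quoted in this article. The `PRICES` dictionary and the `monthly_cost` helper are hypothetical names introduced here, and tool, caching, and grounding charges are deliberately left out.

```python
# Prices in USD per 1M tokens, copied from the figures quoted in this article.
PRICES = {
    "GPT-5 Mini": {"input": 0.25, "output": 2.00},
    "Gemini 3.1 Flash Lite": {"input": 0.25, "output": 1.50},
    "Claude Haiku 4.5": {"input": 1.00, "output": 5.00},
}


def monthly_cost(input_price, output_price, requests=100_000,
                 input_tokens=1_000, output_tokens=400):
    """Return the monthly token bill in USD (base token prices only)."""
    # Convert per-request token counts into millions of tokens per month.
    input_millions = requests * input_tokens / 1_000_000
    output_millions = requests * output_tokens / 1_000_000
    return input_millions * input_price + output_millions * output_price


for name, price in PRICES.items():
    print(f"{name}: ${monthly_cost(price['input'], price['output']):.2f}")
```

With the sample workload, this comes to 100M input tokens and 40M output tokens per month, which is why the output price dominates the totals.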

| Model | Provider | Input / 1M | Output / 1M | Monthly cost | Read |
| --- | --- | --- | --- | --- | --- |
| GPT-5 Mini | OpenAI | $0.25 | $2.00 | $105.00 | Balanced pricing and the broadest general-purpose default of the three. |
| Gemini 3.1 Flash Lite | Google | $0.25 | $1.50 | $85.00 | Cheapest output cost here, with broad multimodal inputs and URL context. |
| Claude Haiku 4.5 | Anthropic | $1.00 | $5.00 | $300.00 | Most expensive in this workload, but strongest Anthropic-native agent surface. |

Cheapest in this scenario
Gemini 3.1 Flash Lite

At $85.00, it comes out cheapest because its output price is lower than GPT-5 Mini's and much lower than Claude Haiku 4.5's.

Largest context
GPT-5 Mini

If your app keeps long threads or large retrieved chunks in one call, the 128,000-token window, roughly twice that of the other two, matters more than a small difference in input price.

Lowest input cost
GPT-5 Mini (tied with Gemini 3.1 Flash Lite)

GPT-5 Mini and Gemini 3.1 Flash Lite both charge $0.25 per 1M input tokens, so the real split comes from output price and platform features.
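Because the two cheapest models share an input rate, the monthly gap between them depends only on output volume. The sketch below illustrates that under the article's quoted output prices. The `monthly_gap` helper is a hypothetical name, and the 100 and 1,600 token cases are illustrative assumptions rather than measured workloads.

```python
# USD per 1M output tokens, as quoted in this article. The input rates are
# identical ($0.25), so they cancel out of the difference.
GPT5_MINI_OUTPUT = 2.00
GEMINI_FLASH_LITE_OUTPUT = 1.50


def monthly_gap(output_tokens, requests=100_000):
    """Extra USD per month for GPT-5 Mini over Gemini 3.1 Flash Lite."""
    output_millions = requests * output_tokens / 1_000_000
    return output_millions * (GPT5_MINI_OUTPUT - GEMINI_FLASH_LITE_OUTPUT)


for tokens in (100, 400, 1_600):
    print(f"{tokens} output tokens per request: ${monthly_gap(tokens):.2f} per month")
```

At 400 output tokens per request the gap is $20.00, matching the difference between $105.00 and $85.00 in the sample workload. It scales linearly as responses get longer.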

How to choose

Pick by the pressure in your product

Choose GPT-5 Mini

When you want the cleanest general-purpose default: strong tool calling, prompt caching, web search, image input, and a 128k context window without moving up to full GPT-5 pricing.

Choose Gemini 3.1 Flash Lite

When your app writes a lot, or when multimodal input matters more than raw context size. It has the lowest output price here and is the only one of the three listed with audio, video, and URL-context input.

Choose Claude Haiku 4.5

When you specifically want Anthropic's workflow surface, especially prompt caching, computer use, and PDF-heavy agent patterns. It needs that fit because the token bill is higher.
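The selection logic above can be sketched as a filter followed by a price sort. This is one possible reading, not a definitive rule: the `Model` class and `shortlist` function are hypothetical, the feature sets are the tags shown in this article (which may not be exhaustive), and the prices and context sizes are the article's figures.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Model:
    name: str
    input_price: float   # USD per 1M input tokens
    output_price: float  # USD per 1M output tokens
    context: int         # context window in tokens, as listed in this article
    features: frozenset  # feature tags from this article (may be incomplete)


MODELS = [
    Model("GPT-5 Mini", 0.25, 2.00, 128_000,
          frozenset({"vision", "function calling", "prompt caching", "web search"})),
    Model("Gemini 3.1 Flash Lite", 0.25, 1.50, 65_536,
          frozenset({"vision", "audio input", "video input", "url context"})),
    Model("Claude Haiku 4.5", 1.00, 5.00, 64_000,
          frozenset({"vision", "prompt caching", "computer use", "pdf input"})),
]


def shortlist(required_features=(), min_context=0,
              input_tokens=1_000, output_tokens=400):
    """Return models meeting the requirements, cheapest per request first."""
    required = set(required_features)
    fits = [m for m in MODELS
            if required <= m.features and m.context >= min_context]

    def per_request(m):
        # Base token cost of a single request in USD.
        return (input_tokens * m.input_price
                + output_tokens * m.output_price) / 1_000_000

    return sorted(fits, key=per_request)


print([m.name for m in shortlist()])
print([m.name for m in shortlist(required_features={"prompt caching"})])
print([m.name for m in shortlist(min_context=100_000)])
print([m.name for m in shortlist(required_features={"computer use"})])
```

Filtering on hard requirements first and sorting by price second mirrors the advice here: feature fit decides whether a model is in the running, and the token bill decides among those that remain.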

Sources

Where to verify the details

Use the provider docs for final pricing checks. This comparison covers only the shared baseline of input and output token prices; each provider adds its own rules for caching, tools, and grounded requests.