Qwen2.5-Coder-3B-Instruct
$0.12 / month
$0.0100 / 1M input tokens and $0.0300 / 1M output tokens.
Summarization cost screen
Summarization workloads are usually input-heavy: the model reads a long document and writes a short summary. This page prices one narrow scenario from the summarization use-case page: a 10,000-token document, a 500-token summary, and 1,000 documents per month.
Scenario
This is a cost screen, not a quality benchmark. It is meant to help you find models worth testing in the calculator and compare pages.
Document
At 10,000 tokens per document, the input price dominates the monthly bill.
Summary
Output price still matters, but less than the document read.
Volume
The per-document cost is multiplied by a modest batch of 1,000 documents per month.
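The arithmetic behind the scenario can be sketched as follows. This is a minimal illustration, not the calculator's actual code: the names `monthly_cost` and `to_cents` are hypothetical, and rounding half up to the nearest cent is an assumption about how the displayed figures are produced.

```python
from decimal import Decimal, ROUND_HALF_UP

TOKENS_PER_PRICE_UNIT = Decimal(1_000_000)  # prices are quoted per 1M tokens


def monthly_cost(input_price, output_price,
                 doc_tokens=10_000, summary_tokens=500, docs_per_month=1_000):
    """Unrounded monthly cost in dollars for the scenario on this page.

    Prices are passed as strings (dollars per 1M tokens) so Decimal keeps
    them exact instead of inheriting binary floating-point error.
    """
    per_doc = (doc_tokens * Decimal(input_price)
               + summary_tokens * Decimal(output_price)) / TOKENS_PER_PRICE_UNIT
    return per_doc * docs_per_month


def to_cents(amount):
    # Assumption: displayed costs are rounded half up to the nearest cent.
    return amount.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)


cost = monthly_cost("0.0100", "0.0300")
input_part = monthly_cost("0.0100", "0")  # cost of reading the documents only
print(f"monthly cost: ${to_cents(cost)}")
print(f"input share:  {input_part / cost:.0%}")
```

Setting the output price to zero isolates the cost of the document read, which makes it easy to see how much of the bill comes from input tokens.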
Cost screen
Prices come from the checked-in model database. Click a model to inspect its source and metadata before production use.
| Model | Input ($ / 1M tokens) | Output ($ / 1M tokens) | Monthly cost |
|---|---|---|---|
| Qwen2.5-Coder-3B-Instruct | $0.0100 | $0.0300 | $0.12 |
| Qwen2.5-Coder-7B-Instruct | $0.0100 | $0.0300 | $0.12 |
| Qwen2.5-Coder-7B | $0.0100 | $0.0300 | $0.12 |
| llama3.2-11b-vision-instruct | $0.0150 | $0.0250 | $0.16 |
| llama3.2-3b-instruct | $0.0150 | $0.0250 | $0.16 |
| Llama-3.2-3B-Instruct | $0.0200 | $0.0200 | $0.21 |
| paddleocr-vl | $0.0200 | $0.0200 | $0.21 |
| Meta-Llama-3.1-8B-Instruct-Turbo | $0.0200 | $0.0300 | $0.22 |
Shortlist
Qwen2.5-Coder-3B-Instruct, Qwen2.5-Coder-7B-Instruct, and Qwen2.5-Coder-7B tie for the lowest cost in this scenario: $0.12 / month each, at $0.0100 / 1M input tokens and $0.0300 / 1M output tokens.
A 2,000-token note and a 50,000-token transcript can produce very different winners. Use the summarization page to change token counts, then compare the shortlisted models side by side.
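To see how the ordering responds to different token counts, the screen can be recomputed locally. The sketch below is an illustration under stated assumptions: the prices are copied from the table on this page, the names `PRICES`, `monthly_cost`, and `rank` are hypothetical, and it keeps the page's 500-token summary and 1,000 documents per month unless told otherwise.

```python
from decimal import Decimal

# (input, output) prices in dollars per 1M tokens, copied from the table above.
PRICES = {
    "Qwen2.5-Coder-3B-Instruct": ("0.0100", "0.0300"),
    "Qwen2.5-Coder-7B-Instruct": ("0.0100", "0.0300"),
    "Qwen2.5-Coder-7B": ("0.0100", "0.0300"),
    "llama3.2-11b-vision-instruct": ("0.0150", "0.0250"),
    "llama3.2-3b-instruct": ("0.0150", "0.0250"),
    "Llama-3.2-3B-Instruct": ("0.0200", "0.0200"),
    "paddleocr-vl": ("0.0200", "0.0200"),
    "Meta-Llama-3.1-8B-Instruct-Turbo": ("0.0200", "0.0300"),
}


def monthly_cost(prices, doc_tokens, summary_tokens, docs_per_month=1_000):
    """Unrounded monthly cost in dollars for one model."""
    input_price, output_price = (Decimal(p) for p in prices)
    per_doc = (doc_tokens * input_price
               + summary_tokens * output_price) / Decimal(1_000_000)
    return per_doc * docs_per_month


def rank(doc_tokens, summary_tokens, docs_per_month=1_000):
    """Return (model, monthly cost) pairs, cheapest first."""
    costs = [(name, monthly_cost(p, doc_tokens, summary_tokens, docs_per_month))
             for name, p in PRICES.items()]
    # sorted() is stable, so models with identical cost keep their table order.
    return sorted(costs, key=lambda item: item[1])


for doc_tokens in (2_000, 50_000):
    print(f"{doc_tokens}-token document, 500-token summary:")
    for name, cost in rank(doc_tokens, 500)[:3]:
        print(f"  {name}: ${cost:.4f} / month")
```

Because each model's cost is a weighted sum of its two prices, the ranking depends on the ratio of document tokens to summary tokens as well as on the prices themselves, so it is worth rerunning with token counts that match your own documents.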
Caveats
This page does not rank summary quality, factuality, citation behavior, latency, rate limits, prompt caching, discounts, regional pricing, or provider-specific add-on charges. Treat it as a cost-first shortlist, then run your own documents through candidate models and confirm final pricing with the provider.