Summarization

Compare model costs for workloads with long inputs and short outputs, such as summarizing reports, articles, notes, and other text-heavy material.

What matters most

Summarization is usually input-heavy. Low input price matters a lot, while output price matters less if the summary stays short.

The table below uses a starting scenario of 10,000 input tokens per document, a 500-token summary per document, and 1,000 documents per month.
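To make the "input-heavy" point concrete, the monthly cost can be written as a formula. This is a sketch that assumes per-token billing with no minimums, caching discounts, or fixed fees. The symbols D (documents per month), T (tokens per document), and p (price per 1M tokens) are introduced here for illustration and do not appear on the page.

```latex
\[
\text{monthly cost}
  = \frac{D \,\bigl(T_{\text{in}}\, p_{\text{in}} + T_{\text{out}}\, p_{\text{out}}\bigr)}{10^{6}},
\qquad
\text{input share}
  = \frac{T_{\text{in}}\, p_{\text{in}}}{T_{\text{in}}\, p_{\text{in}} + T_{\text{out}}\, p_{\text{out}}}
\]

% Worked example: D = 1000, T_in = 10000, T_out = 500,
% p_in = $0.0100 and p_out = $0.0300 per 1M tokens
\[
\frac{1000 \,(10000 \times 0.0100 + 500 \times 0.0300)}{10^{6}}
  = 0.100 + 0.015 = 0.115,
\qquad
\text{input share} = \frac{0.100}{0.115} \approx 87\%
\]
```

In this example the output price is three times the input price, yet input tokens still account for roughly 87% of the bill, because each document carries 20 times more input tokens than output tokens.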

Model	Input price	Output price	Monthly cost
Qwen2.5-Coder-3B-Instruct	$0.0100 / 1M tokens	$0.0300 / 1M tokens	$0.12
Qwen2.5-Coder-7B-Instruct	$0.0100 / 1M tokens	$0.0300 / 1M tokens	$0.12
Qwen2.5-Coder-7B	$0.0100 / 1M tokens	$0.0300 / 1M tokens	$0.12
llama3.2-11b-vision-instruct	$0.0150 / 1M tokens	$0.0250 / 1M tokens	$0.16
llama3.2-3b-instruct	$0.0150 / 1M tokens	$0.0250 / 1M tokens	$0.16
titan-embed-text-v2	$0.0200 / 1M tokens	N/A	$0.20
Llama-3.2-3B-Instruct	$0.0200 / 1M tokens	$0.0200 / 1M tokens	$0.21
paddleocr-vl	$0.0200 / 1M tokens	$0.0200 / 1M tokens	$0.21
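
The Monthly cost column can be reproduced with a short script. This is a minimal sketch, not the page's actual calculator. It assumes plain per-token billing, treats an "N/A" output price as zero output cost, and rounds half-up to the nearest cent. The names `monthly_cost` and `to_cents` are hypothetical helpers introduced for illustration.

```python
from decimal import Decimal, ROUND_HALF_UP
from typing import Optional, Tuple

TOKENS_PER_MILLION = Decimal(1_000_000)


def monthly_cost(
    input_price: str,
    output_price: Optional[str],
    input_tokens: int = 10_000,
    output_tokens: int = 500,
    docs_per_month: int = 1_000,
) -> Tuple[Decimal, Decimal]:
    """Return (input_cost, output_cost) in dollars for one month.

    Prices are dollars per 1M tokens, passed as strings so Decimal
    keeps them exact. output_price=None models an "N/A" table cell
    (assumption: no output charge).
    """
    input_millions = Decimal(input_tokens * docs_per_month) / TOKENS_PER_MILLION
    input_cost = input_millions * Decimal(input_price)
    if output_price is None:
        output_cost = Decimal(0)
    else:
        output_millions = Decimal(output_tokens * docs_per_month) / TOKENS_PER_MILLION
        output_cost = output_millions * Decimal(output_price)
    return input_cost, output_cost


def to_cents(amount: Decimal) -> Decimal:
    """Round a dollar amount half-up to two decimal places (assumed rounding rule)."""
    return amount.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)


if __name__ == "__main__":
    # A few rows copied from the table above: (model, input price, output price).
    rows = [
        ("Qwen2.5-Coder-3B-Instruct", "0.0100", "0.0300"),
        ("llama3.2-3b-instruct", "0.0150", "0.0250"),
        ("titan-embed-text-v2", "0.0200", None),
        ("Llama-3.2-3B-Instruct", "0.0200", "0.0200"),
    ]
    for model, p_in, p_out in rows:
        cost_in, cost_out = monthly_cost(p_in, p_out)
        total = cost_in + cost_out
        share = cost_in / total  # fraction of the bill driven by input tokens
        print(f"{model}: ${to_cents(total)} (input share {share:.0%})")
```

Changing `input_tokens`, `output_tokens`, or `docs_per_month` shows how the ranking shifts for other workloads. For example, longer summaries give the output price more weight.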
