nscale
Qwen2.5-Coder-3B-Instruct
In this workload, the estimated monthly API cost is $0.33. No context window is listed for this route (N/A).
Coding workloads typically have longer prompts, more context, and larger outputs than a support chatbot. This screen prices a single-turn coding task: 2,000 input tokens and 1,500 output tokens per task, 50 tasks per user per month, and 100 monthly users.
Scenario
The numbers are generated from the site's current pricing data. Provider bills can differ because of cache behavior, discounts, regions, tiers, or provider-specific billing rules.
2,000 input tokens: prompt plus repo context or file snippets.
1,500 output tokens: code generation, explanation, or refactoring output.
50 tasks per user: repeated coding tasks per month.
100 monthly users: 5,000 total tasks.
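The page does not state the calculator's formula. As a sketch, assuming a linear per-token price with no caching, discounts, or tiers, the monthly cost $C$ for a route with input price $p_{\text{in}}$ and output price $p_{\text{out}}$ (both in dollars per 1M tokens) would be:

$$
C = N \cdot \frac{t_{\text{in}}\, p_{\text{in}} + t_{\text{out}}\, p_{\text{out}}}{10^{6}},
\qquad N = 50 \times 100 = 5{,}000,\quad t_{\text{in}} = 2{,}000,\quad t_{\text{out}} = 1{,}500
$$

For a route priced at $0.0100 input and $0.0300 output per 1M tokens, this gives:

$$
C = 5{,}000 \cdot \frac{2{,}000 \times 0.01 + 1{,}500 \times 0.03}{10^{6}}
  = 5{,}000 \cdot \frac{65}{10^{6}}
  = 0.325 \approx \$0.33
$$

That matches the lowest monthly cost shown in the cost screen, which suggests (but does not confirm) that the site uses a formula of this shape.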
Cost screen
Rows are sorted by estimated monthly API cost. Open each model page before treating any route as production-ready.
| Model | Provider | Context | Input / 1M tokens | Output / 1M tokens | Monthly cost |
|---|---|---|---|---|---|
| Qwen2.5-Coder-3B-Instruct | nscale | N/A | $0.0100 | $0.0300 | $0.33 |
| Qwen2.5-Coder-7B-Instruct | nscale | N/A | $0.0100 | $0.0300 | $0.33 |
| Qwen2.5-Coder-7B | nebius | 32.8K | $0.0100 | $0.0300 | $0.33 |
| llama3.2-11b-vision-instruct | lambda_ai | 131.1K | $0.0150 | $0.0250 | $0.34 |
| llama3.2-3b-instruct | lambda_ai | 131.1K | $0.0150 | $0.0250 | $0.34 |
| Llama-3.2-3B-Instruct | deepinfra | 131.1K | $0.0200 | $0.0200 | $0.35 |
| paddleocr-vl | novita | 16.4K | $0.0200 | $0.0200 | $0.35 |
| Meta-Llama-3.1-8B-Instruct-Turbo | deepinfra | 131.1K | $0.0200 | $0.0300 | $0.43 |
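The monthly cost column can be cross-checked from the listed prices. The sketch below is not the site's implementation: it assumes a linear per-token formula and half-up rounding to cents, and the names `ROUTES`, `monthly_cost`, and `to_cents` are hypothetical. Under those assumptions it yields the same figures as the table.

```python
from decimal import Decimal, ROUND_HALF_UP

# Scenario from this page: 2,000 input and 1,500 output tokens per task,
# 50 tasks per user per month, 100 monthly users.
INPUT_TOKENS = 2_000
OUTPUT_TOKENS = 1_500
TASKS_PER_MONTH = 50 * 100  # 5,000 total tasks

# (route, input $ per 1M tokens, output $ per 1M tokens), copied from the table.
ROUTES = [
    ("Qwen2.5-Coder-3B-Instruct nscale", "0.0100", "0.0300"),
    ("Qwen2.5-Coder-7B-Instruct nscale", "0.0100", "0.0300"),
    ("Qwen2.5-Coder-7B nebius", "0.0100", "0.0300"),
    ("llama3.2-11b-vision-instruct lambda_ai", "0.0150", "0.0250"),
    ("llama3.2-3b-instruct lambda_ai", "0.0150", "0.0250"),
    ("Llama-3.2-3B-Instruct deepinfra", "0.0200", "0.0200"),
    ("paddleocr-vl novita", "0.0200", "0.0200"),
    ("Meta-Llama-3.1-8B-Instruct-Turbo deepinfra", "0.0200", "0.0300"),
]


def monthly_cost(input_price: str, output_price: str) -> Decimal:
    """Assumed formula: (monthly tokens / 1M) * price, summed over input and output."""
    million = Decimal(1_000_000)
    input_cost = Decimal(INPUT_TOKENS * TASKS_PER_MONTH) / million * Decimal(input_price)
    output_cost = Decimal(OUTPUT_TOKENS * TASKS_PER_MONTH) / million * Decimal(output_price)
    return input_cost + output_cost


def to_cents(amount: Decimal) -> Decimal:
    """Round to whole cents; half-up rounding is an assumption, not documented by the site."""
    return amount.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)


for name, p_in, p_out in ROUTES:
    print(f"{name}: ${to_cents(monthly_cost(p_in, p_out))}")
```

Using `Decimal` with string prices avoids binary floating-point surprises at exact half-cent values such as 0.325 and 0.425, which is where the choice of rounding mode matters.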
If your coding tasks use longer prompts, more iterations, or fewer monthly users, rerun the calculator with your own token counts.
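For a rough offline what-if before rerunning the calculator, the scenario can be parameterized. This is a minimal sketch assuming linear per-token pricing. The `Workload` class is hypothetical, and the "heavier" scenario is an invented example, not data from this page.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Workload:
    """Hypothetical container for the four scenario inputs used on this page."""
    input_tokens: int      # input tokens per task
    output_tokens: int     # output tokens per task
    tasks_per_user: int    # tasks per user per month
    monthly_users: int

    def monthly_cost(self, input_price_per_m: float, output_price_per_m: float) -> float:
        """Estimated monthly cost in dollars, assuming linear per-token pricing."""
        tasks = self.tasks_per_user * self.monthly_users
        per_task = (
            self.input_tokens * input_price_per_m
            + self.output_tokens * output_price_per_m
        ) / 1_000_000
        return tasks * per_task


# Baseline scenario from this page.
baseline = Workload(input_tokens=2_000, output_tokens=1_500,
                    tasks_per_user=50, monthly_users=100)

# Invented variation: longer prompts, more iterations, fewer users.
heavier = Workload(input_tokens=8_000, output_tokens=1_500,
                   tasks_per_user=150, monthly_users=20)

# Prices in dollars per 1M tokens, taken from the cheapest rows of the cost screen.
for label, workload in [("baseline", baseline), ("heavier", heavier)]:
    print(f"{label}: ${workload.monthly_cost(0.01, 0.03):.4f} per month")
```

Because the formula is linear, quadrupling the input tokens only quadruples the input share of the bill. Output-heavy tasks shift the comparison toward routes with cheaper output pricing.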
Caveats
This page does not rank code-generation quality, latency, tool calling, IDE integration, repository understanding, or rate limits. Some low-cost routes may be specialized, gated, or inappropriate for production coding workflows. Use this as a pricing shortlist, then test the exact model route and verify final pricing with the provider.