LLM Pricing

Benchmark explorer

Benchmarks

Select one benchmark, then compare base model families within that benchmark only. Scores are source-linked evidence rows, not a universal leaderboard.

214 Benchmarks
1,304 Model routes
12,935 Route rows

This page intentionally avoids cross-benchmark ranking. Pick a benchmark first; the chart and table below only compare rows with that same benchmark label. Rows may come from official model cards, launch posts, papers, or benchmark operators.
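
The same-benchmark rule above can be sketched in a few lines of Python. This is only an illustration, not the page's actual implementation: the `Row` type, its field names (loosely modeled on the table columns), the `rows_for_benchmark` helper, and the sample rows are all hypothetical placeholders, and the values are not real results. Sorting by score also assumes that rows sharing a benchmark label use a comparable metric.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Row:
    """One evidence row (hypothetical shape, not the site's schema)."""
    benchmark: str
    family: str
    score: float
    metric: str
    source: str


def rows_for_benchmark(rows, benchmark):
    """Keep only rows with exactly this benchmark label, highest score first."""
    selected = [r for r in rows if r.benchmark == benchmark]
    return sorted(selected, key=lambda r: r.score, reverse=True)


# Placeholder rows for illustration only; these are not real benchmark results.
ROWS = [
    Row("benchmark-x", "family-a", 61.0, "accuracy", "model card"),
    Row("benchmark-y", "family-a", 12.5, "pass@1", "paper"),
    Row("benchmark-x", "family-b", 72.0, "accuracy", "launch post"),
]

# Rows labeled "benchmark-y" never appear in a "benchmark-x" comparison.
for row in rows_for_benchmark(ROWS, "benchmark-x"):
    print(row.family, row.score, row.metric, row.source)
```

Filtering before sorting keeps scores from different benchmarks out of the same ordering, which is the point of avoiding a cross-benchmark ranking.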

MMLU
SWE-bench Verified
Mistral 7B comparison table
HumanEval
GPQA Diamond
Artificial Analysis Coding Index
Artificial Analysis Intelligence Index
Artificial Analysis Agentic Index

Benchmark results

Results are grouped by the selected benchmark.

Family Score Metric Category Scope Routes Source