Intelligence Arena

AI Leaderboard for LLM Rankings

Data source: LiveBench benchmark data

The AI Leaderboard compares large language models across overall performance, coding, math, and reasoning categories, then displays searchable model rankings and category winners in a browser-based table.

The Intelligence Arena uses benchmark columns from LiveBench and groups them into practical comparison categories. Use the tabs to sort by overall score, coding, math, or reasoning, then search by model name to compare leading AI systems quickly.

How does the AI leaderboard calculate rankings?

The page parses the benchmark CSV data, averages related benchmark columns into category scores, calculates an overall score as the average of all numeric benchmark columns, and sorts models by the active category.

Category | Included benchmark areas | What it helps compare
--- | --- | ---
Overall | All numeric benchmark columns available in the dataset. | Broad cross-domain model strength across the full table.
Coding | Code completion, code generation, JavaScript, Python, and TypeScript scores. | Software engineering, syntax, implementation, and programming task performance.
Math | Math competition, AMPS Hard, and Olympiad-style scores. | Advanced quantitative reasoning and mathematical problem solving.
Reasoning | Zebra puzzle, spatial, connections, theory of mind, and plot unscrambling scores. | Logical inference, pattern solving, spatial understanding, and narrative reasoning.
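The calculation described above can be sketched roughly as follows. This is a minimal illustration, not the page's actual code: the column names in CATEGORIES, the "model" header, and the sample rows are all assumptions made for the example, and the real LiveBench CSV headers may differ. Blank or non-numeric cells are skipped here, which is also an assumption about how missing scores are handled.

```python
import csv
import io
import math
from statistics import mean

# Hypothetical column names, chosen to mirror the category table.
# The real LiveBench CSV headers may differ.
CATEGORIES = {
    "coding": ["code_completion", "code_generation", "javascript", "python", "typescript"],
    "math": ["math_comp", "amps_hard", "olympiad"],
    "reasoning": ["zebra_puzzle", "spatial", "connections", "theory_of_mind", "plot_unscrambling"],
}


def to_float(value):
    """Return value as a finite float, or None if it is blank or non-numeric."""
    try:
        number = float(value)
    except (TypeError, ValueError):
        return None
    return number if math.isfinite(number) else None


def score_models(csv_text, model_column="model"):
    """Parse CSV text and return one dict of category scores per model."""
    rows = []
    for record in csv.DictReader(io.StringIO(csv_text)):
        # Keep only the cells that parse as numbers.
        numeric = {}
        for column, value in record.items():
            if column == model_column:
                continue
            number = to_float(value)
            if number is not None:
                numeric[column] = number
        scores = {"model": record[model_column]}
        # Overall: mean of every numeric benchmark column in the row.
        scores["overall"] = mean(numeric.values()) if numeric else None
        # Each category: mean of whichever of its columns are present.
        for category, columns in CATEGORIES.items():
            present = [numeric[c] for c in columns if c in numeric]
            scores[category] = mean(present) if present else None
        rows.append(scores)
    return rows


def rank(rows, category="overall"):
    """Sort models by the active category, highest first; unscored models go last."""
    return sorted(rows, key=lambda r: (r[category] is None, -(r[category] or 0.0)))


# Made-up sample data, for illustration only.
SAMPLE = """model,code_generation,python,math_comp,olympiad,zebra_puzzle,spatial
model-a,80,70,60,50,40,30
model-b,60,,90,70,50,50
"""

rows = score_models(SAMPLE)
for row in rank(rows, "coding"):
    print(row["model"], row["coding"])
```

Averaging only the columns that are present keeps a model with a missing benchmark from being scored as zero, but it also means two models may be compared on slightly different column sets. Whether the page does this is not stated in the source.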

AI Leaderboard FAQ

What does the AI Leaderboard compare?

It compares large language models across overall performance, coding, math, and reasoning categories.

How are category scores calculated?

Related benchmark columns are grouped into categories, averaged, and used to sort the model table.

Can I search by model name?

Yes. Once the benchmark CSV has loaded, the search field filters the leaderboard by model name.
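A name filter of this kind might look like the sketch below. It assumes case-insensitive substring matching and a "model" key on each row; neither detail is confirmed by the source, and the sample rows are invented for the example.

```python
def filter_by_name(rows, query):
    """Return rows whose model name contains query, ignoring case and outer spaces."""
    needle = query.strip().casefold()
    if not needle:
        # An empty search box shows the full leaderboard.
        return list(rows)
    return [row for row in rows if needle in row["model"].casefold()]


# Made-up rows, for illustration only.
leaderboard = [
    {"model": "Alpha-Large", "overall": 61.2},
    {"model": "alpha-mini", "overall": 48.9},
    {"model": "Beta-Pro", "overall": 57.4},
]

matches = filter_by_name(leaderboard, " ALPHA ")
print([row["model"] for row in matches])
```

Filtering returns a new list and leaves the loaded rows untouched, so clearing the search box can restore the full table without reparsing the CSV.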