AI Benchmark Hub methodology

AI Benchmark Hub combines live API catalog data with community benchmark signals so you can rank and compare frontier language models without a hidden composite score. Every metric on the leaderboard links back to a named source.

Quality index

Arena Elo mapped to 0–100.

When an LMArena text leaderboard rating (Elo) is available for a model, we use it as that model's quality signal. We normalize these ratings into a 0–100 quality index so you can blend quality with speed, cost, context, and safety on the weighted leaderboard.
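
As a rough illustration of how an Elo rating can be mapped onto a 0–100 scale, here is a minimal sketch that uses min-max scaling over the models that currently have a rating. The scaling method, the function name normalize_elo, and the rounding are assumptions made for this example; the page does not state the exact formula the site uses.

```python
from typing import Dict, Optional


def normalize_elo(elo_by_model: Dict[str, Optional[float]]) -> Dict[str, Optional[float]]:
    """Map Elo ratings to a 0-100 index with min-max scaling (assumed method)."""
    # Only models that actually have an Arena rating define the scale.
    known = [v for v in elo_by_model.values() if v is not None]
    if not known:
        return {name: None for name in elo_by_model}

    lo, hi = min(known), max(known)
    span = hi - lo

    result: Dict[str, Optional[float]] = {}
    for name, elo in elo_by_model.items():
        if elo is None:
            result[name] = None      # no rating available: leave quality blank
        elif span == 0:
            result[name] = 100.0     # every rated model has the same Elo
        else:
            result[name] = round(100.0 * (elo - lo) / span, 1)
    return result


if __name__ == "__main__":
    # Hypothetical ratings, for illustration only.
    print(normalize_elo({"model-a": 1200, "model-b": 1300, "model-c": 1400, "model-d": None}))
```

With min-max scaling the index is relative to the current leaderboard, so a model's value can shift when the lowest or highest rated model changes even if its own Elo does not.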

Catalog & pricing

OpenRouter live API.

Model names, context windows, modalities, and per-token pricing are fetched from OpenRouter and cached for about one hour. Use the compare view for side-by-side specs, or the arena to test behavior on real prompts.
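
The one-hour cache described above can be pictured as a simple time-to-live wrapper around the catalog request. The sketch below is an assumption about the general pattern, not the site's actual code: TTLCache and its fetch and clock parameters are hypothetical names, and fetch stands in for whatever call retrieves the OpenRouter catalog.

```python
import time
from typing import Any, Callable, Optional


class TTLCache:
    """Cache the result of one fetch call and refresh it after ttl_seconds."""

    def __init__(
        self,
        fetch: Callable[[], Any],                      # hypothetical catalog fetcher
        ttl_seconds: float = 3600.0,                   # "about one hour"
        clock: Callable[[], float] = time.monotonic,   # injectable for testing
    ) -> None:
        self._fetch = fetch
        self._ttl = ttl_seconds
        self._clock = clock
        self._value: Any = None
        self._fetched_at: Optional[float] = None

    def get(self) -> Any:
        now = self._clock()
        # Refetch on first use or once the cached copy is older than the TTL.
        if self._fetched_at is None or now - self._fetched_at >= self._ttl:
            self._value = self._fetch()
            self._fetched_at = now
        return self._value


if __name__ == "__main__":
    cache = TTLCache(lambda: {"models": ["example-model"]})
    print(cache.get())  # first call fetches
    print(cache.get())  # second call is served from the cache
```

One consequence of this pattern is that a pricing or context-window change upstream can take up to the full TTL to show up in the table.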

Your-weights ranking

No vendor default formula.

Sliders set the weight of five dimensions: quality, speed, cost, maximum context, and safety moderation. The table re-sorts as you move them, and the URL encodes your weights so you can share the exact view with your team.
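
To make the slider behavior concrete, here is a minimal sketch of a weighted ranking with shareable weights. It rests on several assumptions not stated on this page: every metric is already scaled to 0–100 with higher meaning better (so cost would be inverted beforehand), the score is a weighted average, dimensions with missing data are skipped rather than counted as zero, and the weights travel as plain query parameters. The names DIMENSIONS, weighted_score, rank, encode_weights, and decode_weights are hypothetical.

```python
from typing import Dict, List, Optional, Tuple
from urllib.parse import parse_qs, urlencode

DIMENSIONS = ("quality", "speed", "cost", "context", "safety")

Metrics = Dict[str, Optional[float]]


def weighted_score(metrics: Metrics, weights: Dict[str, float]) -> float:
    """Weighted average over dimensions that have data and a positive weight."""
    total = 0.0
    weight_sum = 0.0
    for dim in DIMENSIONS:
        value = metrics.get(dim)
        w = weights.get(dim, 0.0)
        if value is None or w <= 0:
            continue  # assumed policy: skip missing data instead of scoring it 0
        total += w * value
        weight_sum += w
    return total / weight_sum if weight_sum else 0.0


def rank(models: Dict[str, Metrics], weights: Dict[str, float]) -> List[Tuple[str, float]]:
    """Return (model, score) pairs sorted from best to worst."""
    scored = [(name, weighted_score(m, weights)) for name, m in models.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


def encode_weights(weights: Dict[str, float]) -> str:
    """Serialize weights as a URL query string in a fixed dimension order."""
    return urlencode({dim: weights.get(dim, 0) for dim in DIMENSIONS})


def decode_weights(query: str) -> Dict[str, float]:
    """Parse a query string back into weights; absent dimensions default to 0."""
    parsed = parse_qs(query)
    return {dim: float(parsed[dim][0]) if dim in parsed else 0.0 for dim in DIMENSIONS}


if __name__ == "__main__":
    # Hypothetical metrics, for illustration only.
    models = {
        "model-a": {"quality": 90, "speed": 40, "cost": 50},
        "model-b": {"quality": 70, "speed": 90, "cost": 90},
    }
    weights = {"quality": 3, "speed": 1, "cost": 1, "context": 0, "safety": 0}
    print(rank(models, weights))
    print(encode_weights(weights))
```

Because the query string fully determines the weights, anyone opening the shared link recomputes the same ordering from the same underlying data.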

Transparency

Reviewable changes.

Methodology updates are documented on this page. If a source is temporarily unavailable, models still appear with the fields we can verify.
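
The fallback for an unavailable source can be sketched as a merge that keeps every catalog row and leaves unverified fields empty. This is an illustrative assumption about the approach: merge_sources, the "elo" field name, and the use of None for a missing value are hypothetical and not taken from the site.

```python
from typing import Dict, Optional


def merge_sources(
    catalog: Dict[str, dict],
    arena_elo: Optional[Dict[str, float]],
) -> Dict[str, dict]:
    """Combine catalog specs with Arena ratings; tolerate a missing Arena source."""
    rows: Dict[str, dict] = {}
    for model_id, spec in catalog.items():
        row = dict(spec)  # copy so the cached catalog entry is not mutated
        # If the Arena source is down (None) or lacks this model, leave Elo blank.
        row["elo"] = arena_elo.get(model_id) if arena_elo is not None else None
        rows[model_id] = row
    return rows


if __name__ == "__main__":
    # Hypothetical catalog entry, for illustration only.
    catalog = {"model-a": {"context": 128000}}
    print(merge_sources(catalog, None))
    print(merge_sources(catalog, {"model-a": 1300.0}))
```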
