AI Software

Frontier Foundation Labs

The 15 labs producing frontier or near-frontier large language and multimodal models. Each row gives country, founding year, latest flagship model and its release month, most recent funding round, valuation, ARR where disclosed, open versus closed weights, and backer or distribution relationships. ARR is left null (n/d) where a lab has not publicly disclosed it, which covers most labs other than OpenAI and Anthropic.
| # | Lab | Country | Founded | Latest model | Model released | Last raise | Valuation | ARR (latest) | Open / Closed | Backers |
|---|-----|---------|---------|--------------|----------------|------------|-----------|--------------|---------------|---------|
| 1 | OpenAI | US | 2015 | GPT-5 | 2025-08 | $6.6B (2025-10) | $500B | $13B (as of 2025-09) | closed | Microsoft (preferred compute partner, equity stake), Thrive Capital, SoftBank, Nvidia, Khosla Ventures, Tiger Global |
| 2 | Anthropic | US | 2021 | Claude Opus 4.5 / Sonnet 4.5 | 2025-09 | $5.0B (2025-08) | $170B | $5.0B (as of 2025-07) | closed | Amazon (multi-billion strategic, Bedrock distribution, Trainium/Inferentia compute), Google (multi-billion, GCP TPU compute), Lightspeed, Spark Capital, Fidelity |
| 3 | Google DeepMind | US/UK | 2010 | Gemini 2.5 Pro | 2025-03 | n/a | n/a | n/d | closed | Wholly owned by Alphabet (GOOGL). DeepMind acquired by Google 2014; merged with Google Brain into Google DeepMind in April 2023. |
| 4 | xAI | US | 2023 | Grok 4 | 2025-07 | $10B (2025-07) | $80B | n/d | mixed | Valor Equity Partners, Andreessen Horowitz, Sequoia, Vy Capital, Fidelity, Saudi Kingdom Holding. Compute partnerships with Oracle (OCI) and on-prem Colossus cluster (Memphis). Grok models distributed via X (formerly Twitter) consumer app. |
| 5 | Meta AI / FAIR | US | 2013 | Llama 4 | 2025-04 | n/a | n/a | n/d | open-weights | Wholly inside Meta Platforms (META). FAIR founded 2013; Llama line distributed under Llama Community License (open-weights with commercial use clause for entities under 700M MAU). |
| 6 | Mistral AI | FR | 2023 | Mistral Large 2 | 2024-07 | $640M (2024-06) | $6.2B | n/d | mixed | General Catalyst, Andreessen Horowitz, Lightspeed, Microsoft (commercial partnership), NVIDIA, Salesforce, BNP Paribas, Bpifrance (France sovereign). Distribution via Azure AI Foundry, Bedrock, Vertex AI. |
| 7 | Cohere | Canada | 2019 | Command A | 2025-03 | $500M (2024-07) | $5.5B | $100M (as of 2024-12) | mixed | PSP Investments, Cisco, Fujitsu, AMD, Inovia, Index Ventures, Tiger Global. Sales focus on enterprise on-prem and sovereign deployments via Oracle OCI partnership. |
| 8 | AI21 Labs | Israel | 2017 | Jamba 1.6 | 2025-03 | $208M (2023-11) | $1.4B | n/d | mixed | Walden Catalyst, NVIDIA, Intel Capital, Google, Pitango, SCB10X. Distribution via AI21 Studio plus Azure, Bedrock, Vertex. |
| 9 | DeepSeek | CN | 2023 | DeepSeek-R1 | 2025-01 | n/a | n/a | n/d | open-weights | Wholly funded by High-Flyer Quant (parent quantitative-trading firm); no external venture capital reported as of the as-of date. Compute on a self-operated A100 plus H800 cluster in China. |
| 10 | Zhipu AI (Z.ai) | CN | 2019 | GLM-4.5 | 2025-07 | $400M (2024-12) | $3.0B | n/d | mixed | Alibaba, Tencent, Meituan, Xiaomi, Hongshan (formerly Sequoia China), Hillhouse. Tsinghua University KEG lab spinout. Registered with the Cyberspace Administration of China (CAC). |
| 11 | Alibaba Qwen | CN | 2023 | Qwen3 / Qwen 2.5 Max | 2025-04 | n/a | n/a | n/d | mixed | Wholly inside Alibaba Cloud (BABA). Qwen 2.5, Qwen 2.5 Coder, Qwen3 weights released under Apache-2.0 in most sizes. Flagship Qwen 2.5 Max remains API-only on Alibaba Cloud Model Studio. |
| 12 | 01.AI | CN | 2023 | Yi-Lightning | 2024-10 | $200M (2024-05) | $1.0B | n/d | mixed | Alibaba Cloud, Hongshan, Sinovation Ventures. Founded by Kai-Fu Lee. Earlier Yi-34B and Yi-9B weights open under Yi Series License (commercial use allowed with registration). |
| 13 | Tencent Hunyuan | CN | 2023 | Hunyuan-Turbo S | 2025-02 | n/a | n/a | n/d | mixed | Wholly inside Tencent (TCEHY / 0700.HK). Hunyuan-Large MoE released open-weights Nov 2024 under Tencent Hunyuan Community License. |
| 14 | Reka AI | US | 2022 | Reka Flash 3 | 2025-03 | $60M (2024-09) | $300M | n/d | mixed | DST Global, Radical Ventures, Snowflake Ventures, Nvidia. Snowflake reportedly explored acquisition mid-2024 ($1B range) but deal did not close. Reka Flash 3 released open-weights under Apache-2.0 Mar 2025. |
| 15 | Black Forest Labs | Germany | 2024 | FLUX.1.1 Pro Ultra | 2024-11 | $31M (2024-08) | $1.0B | n/d | mixed | Andreessen Horowitz, General Catalyst, MätchVC. Founded by ex-Stability AI researchers (creators of Stable Diffusion). FLUX models distributed via API plus partner platforms (xAI Grok uses FLUX for image generation in X). |

Major foundation-model labs ranked by relative scale and ARR. Labs owned inside a public parent (Google DeepMind, Meta AI / FAIR, Alibaba Qwen, Tencent Hunyuan) show n/a for raises and n/d for ARR because financials consolidate into the parent. As of 2026-05-15.
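When the table is loaded for analysis, the money cells and the two placeholders need different handling: n/a (not applicable) and n/d (not disclosed) should become missing values rather than zero. The following is a minimal sketch of one way to do that; the function name `parse_usd` and the suffix map are illustrative assumptions, not part of the source data feed.

```python
from typing import Optional

# Multipliers for the unit suffixes used in the table (assumed: M, B, T).
_SUFFIX = {"M": 1e6, "B": 1e9, "T": 1e12}


def parse_usd(cell: str) -> Optional[float]:
    """Convert a cell such as '$6.6B' or '$640M' to dollars.

    Returns None for 'n/a' and 'n/d' so that undisclosed values are
    never mistaken for zero in sums or rankings.
    """
    text = cell.strip()
    if text.lower() in {"n/a", "n/d", ""}:
        return None
    if not text.startswith("$") or text[-1] not in _SUFFIX:
        raise ValueError(f"unrecognized money cell: {cell!r}")
    return float(text[1:-1]) * _SUFFIX[text[-1]]


# Valuation cells copied from three rows of the table.
valuations = {"OpenAI": "$500B", "Mistral AI": "$6.2B", "Google DeepMind": "n/a"}
parsed = {lab: parse_usd(cell) for lab, cell in valuations.items()}
```

Keeping the missing values as `None` forces any downstream ranking to decide explicitly how to treat labs whose financials consolidate into a parent.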

Frontier Lab Revenue Trajectory

Quarterly annualized run-rate revenue for the labs that have disclosed numbers. OpenAI and Anthropic anchor the series; xAI, Mistral, and Cohere are sparse. Lines disconnect at null cells rather than drawing through them.

Milestone Disclosures

  • 2023-08: OpenAI. Crossed $1B annualized revenue. (The Information)
  • 2024-02: OpenAI. Crossed $2B annualized revenue. (Financial Times)
  • 2024-08: OpenAI. Crossed $4B annualized revenue. (The Information)
  • 2024-12: Anthropic. Crossed $1B annualized revenue. (The Information)
  • 2025-06: OpenAI. Crossed $10B annualized revenue. (Reuters)
  • 2025-07: Anthropic. Crossed $5B annualized revenue. (Bloomberg)

Quarterly annualized revenue run-rate (ARR), USD millions. OpenAI and Anthropic disclose periodically via press; xAI, Mistral, and Cohere disclose less frequently. Null cells reflect quarters with no fresh disclosure; the line breaks rather than interpolates.
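The "break rather than interpolate" rule can be sketched as splitting each series into contiguous runs of disclosed quarters, so that a plotting layer draws one line per run and nothing across a gap. This is an assumed implementation for illustration only; the helper name `split_at_nulls` and the sample values are hypothetical and are not the chart's data.

```python
from typing import List, Optional, Tuple

Point = Tuple[str, float]


def split_at_nulls(series: List[Tuple[str, Optional[float]]]) -> List[List[Point]]:
    """Group consecutive non-null points into segments.

    A None value closes the current segment; no point is synthesized
    for the missing quarter, so nothing is interpolated.
    """
    segments: List[List[Point]] = []
    current: List[Point] = []
    for quarter, value in series:
        if value is None:
            if current:
                segments.append(current)
                current = []
        else:
            current.append((quarter, value))
    if current:
        segments.append(current)
    return segments


# Hypothetical placeholder series (not real ARR figures): Q3 is undisclosed.
sample = [("Q1", 1.0), ("Q2", 2.0), ("Q3", None), ("Q4", 4.0), ("Q5", 5.0)]
segments = split_at_nulls(sample)
# segments == [[("Q1", 1.0), ("Q2", 2.0)], [("Q4", 4.0), ("Q5", 5.0)]]
```

A lab with a single isolated disclosure produces a one-point segment, which would render as a marker with no connecting line.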

Model Capability Benchmarks

A matrix of frontier models against a curated benchmark set: MMLU-Pro, GPQA Diamond, SWE-Bench Verified, AIME 2025, HLE, Chatbot Arena Elo, MMMU-Pro, and ARC-AGI v2. LiveCodeBench is part of the curated set but has no column in this view. Cells are color-graded by performance band; tooltips cite the eval version and source. Self-reported scores are flagged.
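| Model | Lab | Variant | MMLU-Pro (%) | GPQA Diamond (%) | SWE-Bench Verified (%) | AIME 2025 (%) | HLE (%) | Arena Elo | MMMU-Pro (%) | ARC-AGI v2 (%) |
|-------|-----|---------|--------------|------------------|------------------------|---------------|---------|-----------|--------------|----------------|
| GPT-5 | OpenAI | extended-thinking variant (high) | 87.5 | 89.4 | 74.9* | 94.6* | 26.5 | 1,410 | 79.4* | 6.4 |
| Claude Opus 4.5 | Anthropic | extended-thinking enabled | 86.9 | 87.9* | 77.2* | 90.0* | 24.5 | 1,388 | 76.5* | 4.6 |
| Claude Sonnet 4.5 | Anthropic | extended-thinking optional | 83.4 | 83.4* | 77.2* | 87.0* | 19.8 | 1,370 | 70.2* | n/a |
| Gemini 2.5 Pro | Google DeepMind | deep-think mode enabled | 86.7 | 86.4* | 67.2* | 86.7* | 21.6 | 1,397 | 81.7* | 4.9 |
| Grok 4 | xAI | Heavy variant (parallel agents) | 86.6 | 87.5* | 72.0 | 95.0* | 25.4 | 1,380 | n/a | 15.9 |
| Llama 4 Maverick | Meta | no extended-thinking variant | 80.5* | 69.8* | 43.4 | 63.0 | 8.4 | 1,271 | 73.4* | n/a |
| DeepSeek-R1 | DeepSeek | reasoning model (think-aloud) | 84.0 | 71.5* | 49.2* | 87.5* | 8.6 | 1,357 | n/a | 1.3 |
| Qwen3 235B | Alibaba | hybrid thinking-mode toggle | 81.8 | 71.1* | 47.0 | 81.5* | 11.8 | 1,331 | n/a | n/a |
| OpenAI o3 | OpenAI | reasoning model (high effort) | 85.0 | 87.7* | 71.7* | 88.9* | 20.3 | 1,374 | 82.9* | 3.5 |
| GLM-4.5 | Zhipu (Z.ai) | thinking-mode toggle | 84.6* | 79.1* | 64.2* | 91.0* | n/a | 1,325 | n/a | n/a |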

Cells are color-graded by relative performance within each benchmark column (red lowest, green highest). An asterisk (*) marks scores self-reported by the lab; unflagged scores are from third-party leaderboards (Artificial Analysis, LMSys Chatbot Arena, Scale AI HLE, ARC Prize). As of 2026-05-15.
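One plausible reading of "relative performance within each benchmark column" is a per-column min-max scale, with n/a cells left ungraded. The sketch below assumes that scheme; the page does not state its exact banding, and the function name `grade_column` is hypothetical. The input is the ARC-AGI v2 column in table row order.

```python
from typing import List, Optional


def grade_column(scores: List[Optional[float]]) -> List[Optional[float]]:
    """Min-max scale one benchmark column to [0, 1].

    0.0 maps to the red end (column minimum) and 1.0 to the green end
    (column maximum). None (n/a) stays None and does not affect the scale.
    """
    present = [s for s in scores if s is not None]
    if not present:
        return [None] * len(scores)
    lo, hi = min(present), max(present)
    span = hi - lo
    return [
        None if s is None else (0.5 if span == 0 else (s - lo) / span)
        for s in scores
    ]


# ARC-AGI v2 (%) in row order: GPT-5, Claude Opus 4.5, Claude Sonnet 4.5,
# Gemini 2.5 Pro, Grok 4, Llama 4 Maverick, DeepSeek-R1, Qwen3 235B,
# OpenAI o3, GLM-4.5. n/a cells are None.
arc_agi_v2 = [6.4, 4.6, None, 4.9, 15.9, None, 1.3, None, 3.5, None]
grades = grade_column(arc_agi_v2)
```

Because the scale is relative to the column, Grok 4's 15.9 grades as fully green even though the absolute score is low, which is worth keeping in mind when comparing colors across columns.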

API Token Pricing

Current pricing per million tokens for input, output, and cached input across the major frontier APIs, sorted by input ascending. Includes context window and pricing-as-of date. Useful for comparing per-token economics across labs at a fixed quality tier.
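| Model | Provider | Input ($/MTok) | Output ($/MTok) | Cached input ($/MTok) | Context (tokens) | As of |
|-------|----------|----------------|-----------------|-----------------------|------------------|-------|
| Llama 4 Maverick | Meta (via partners) | $0.27 | $0.85 | n/a | 1.0M | 2025-04-08 |
| DeepSeek-R1 | DeepSeek | $0.55 | $2.19 | $0.14 | 64K | 2025-02-08 |
| Qwen3 235B | Alibaba Cloud | $0.60 | $1.80 | n/a | 128K | 2025-04-29 |
| GLM-4.5 | Zhipu (Z.ai) | $0.60 | $2.20 | n/a | 128K | 2025-07-28 |
| GPT-5 | OpenAI | $1.25 | $10.00 | $0.13 | 400K | 2025-08-07 |
| Gemini 2.5 Pro | Google | $1.25 | $10.00 | $0.31 | 1.0M | 2025-06-17 |
| OpenAI o3 | OpenAI | $2.00 | $8.00 | $0.50 | 200K | 2025-06-10 |
| Claude Sonnet 4.5 | Anthropic | $3.00 | $15.00 | $0.30 | 200K | 2025-09-25 |
| Grok 4 | xAI | $3.00 | $15.00 | $0.75 | 256K | 2025-07-09 |
| Claude Opus 4.5 | Anthropic | $15.00 | $75.00 | $1.50 | 200K | 2025-09-25 |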

List prices from each provider's first-party API, sorted by input price ascending. The one exception is Llama 4 Maverick, which is priced via Together AI because Meta does not run a first-party API. Color band is coarse (green = cheapest, red = most expensive). Output prices for reasoning models include billed thinking tokens. Long-context surcharges apply on Claude and Gemini above 200K tokens. As of 2026-05-15.
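To turn the per-million-token prices into a per-request figure, the three rates can be combined as in the sketch below. It uses prices copied from the table for three models and a hypothetical workload of 50,000 input tokens and 5,000 output tokens. The `Price` class and `request_cost` function are illustrative names, and the sketch deliberately ignores long-context surcharges and assumes billed thinking tokens are already counted in `output_tokens`.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Price:
    input_per_mtok: float
    output_per_mtok: float
    cached_per_mtok: Optional[float] = None  # None where the table shows n/a


def request_cost(price: Price, input_tokens: int, output_tokens: int,
                 cached_tokens: int = 0) -> float:
    """Dollar cost of one request at list price.

    cached_tokens is the portion of input_tokens billed at the cached
    rate; if no cached rate is listed, the normal input rate applies.
    """
    if not 0 <= cached_tokens <= input_tokens:
        raise ValueError("cached_tokens must be between 0 and input_tokens")
    cached_rate = (price.cached_per_mtok
                   if price.cached_per_mtok is not None
                   else price.input_per_mtok)
    fresh_tokens = input_tokens - cached_tokens
    total = (fresh_tokens * price.input_per_mtok
             + cached_tokens * cached_rate
             + output_tokens * price.output_per_mtok)
    return total / 1_000_000


# Prices copied from the table above.
PRICES = {
    "DeepSeek-R1": Price(0.55, 2.19, 0.14),
    "GPT-5": Price(1.25, 10.00, 0.13),
    "Claude Sonnet 4.5": Price(3.00, 15.00, 0.30),
}

# Hypothetical workload: 50,000 input tokens, 5,000 output tokens, no cache hits.
costs = {name: request_cost(p, 50_000, 5_000) for name, p in PRICES.items()}
```

For this workload the output tokens contribute a large share of the GPT-5 and Claude Sonnet 4.5 totals despite being a tenth of the token count, which is why the input-price sort alone understates the spread for output-heavy use.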

Token Price Decline Over Time

Log-scale view of dollars per million tokens by capability tier from 2023 through late 2025. Demonstrates the roughly 10x annual deflation in the price of a fixed capability tier as competition and architectural efficiency compound. Major release events are annotated below the chart.
[Chart legend: tier-A; tier-B (formerly tier-A); tier-A reasoning; tier-A (deprecated)]

Annotated Events

  • 2023-03-14: GPT-4 launches at $30 / $60 ($/MTok, input / output)
  • 2023-11-06: GPT-4 Turbo, a 3x input cut
  • 2024-05-13: GPT-4o multimodal, another 2x cut
  • 2024-06-20: Claude 3.5 Sonnet undercuts Opus at 1/5 the price
  • 2024-07-18: GPT-4o mini, tier-A capability at $0.15/MTok input
  • 2024-12-26: DeepSeek-V3 open-weights at sub-$0.30 input
  • 2025-01-20: DeepSeek-R1, reasoning at GPT-4 Turbo class for $0.55 input; followed by the NVIDIA selloff of 2025-01-27
  • 2025-08-07: GPT-5 flagship at $1.25 input, a 24x cut from the original GPT-4 in 29 months

First-party API input pricing in $/MTok at the date of release, log scale on the y-axis. The frontier-equivalent input price has collapsed approximately 24x from GPT-4 launch (Mar 2023) to GPT-5 (Aug 2025). As of 2026-05-15.
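The 24x flagship-to-flagship figure can be converted to an annualized factor with a compound-rate calculation. The sketch below uses only the two input prices and launch dates given in the annotated events; the function name `annualized_decline` is an illustrative assumption.

```python
from datetime import date


def annualized_decline(price_start: float, price_end: float,
                       start: date, end: date) -> float:
    """Constant yearly factor by which the price fell between two dates.

    Solves (price_start / price_end) == factor ** years for factor.
    """
    years = (end - start).days / 365.25
    return (price_start / price_end) ** (1 / years)


# GPT-4 launch ($30 input, 2023-03-14) to GPT-5 ($1.25 input, 2025-08-07).
total_cut = 30.00 / 1.25  # 24.0
factor = annualized_decline(30.00, 1.25, date(2023, 3, 14), date(2025, 8, 7))
# factor is a little under 4x per year
```

This flagship-to-flagship rate is slower than a rate measured at fixed capability, because each new flagship is also more capable than the one it replaces; the two figures answer different questions.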

Training Compute and Cost

Estimated training FLOPs (log scale) and dollar cost for frontier models from GPT-3 through GPT-5 / Grok 3, anchored on the Epoch AI Notable AI Models dataset. Cost cells are null for the most recent frontier runs where labs have not disclosed.

Not Disclosed (FLOPs)

Claude 3 Opus, DeepSeek-R1, Gemini 2.5 Pro, GPT-5, Claude Opus 4.5

Pre-training compute (FLOPs) and rental-equivalent cost (USD) for frontier foundation models 2020 to 2025, log scale. Where the lab published a figure (Llama 3.1, DeepSeek-V3), the lab value is used; otherwise Epoch AI third-party estimates apply. Cost excludes R&D salaries, datacenter overhead, ablation runs, and post-training (RLHF, eval). As of 2026-05-15.
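A rental-equivalent cost can be approximated from a FLOPs estimate with a back-of-envelope conversion: divide total FLOPs by the effective throughput of one accelerator, then multiply the resulting GPU-hours by an hourly rental rate. The sketch below shows that arithmetic only. It is not necessarily the methodology behind the chart or the Epoch AI estimates, and every input value is a hypothetical placeholder rather than a figure for any real model.

```python
def rental_equivalent_cost(train_flops: float,
                           peak_flops_per_gpu: float,
                           utilization: float,
                           usd_per_gpu_hour: float) -> float:
    """Approximate rental cost (USD) of a pre-training run.

    Effective throughput per GPU is peak throughput times utilization.
    Like the chart, this excludes salaries, overhead, ablations, and
    post-training.
    """
    if not 0 < utilization <= 1:
        raise ValueError("utilization must be in (0, 1]")
    gpu_seconds = train_flops / (peak_flops_per_gpu * utilization)
    gpu_hours = gpu_seconds / 3600
    return gpu_hours * usd_per_gpu_hour


# Hypothetical inputs for illustration only: 1e25 training FLOPs,
# 1e15 FLOP/s peak per GPU, 40% utilization, $2.00 per GPU-hour.
cost = rental_equivalent_cost(1e25, 1e15, 0.40, 2.00)
# roughly $13.9M under these assumed inputs
```

The result is linear in each input, so halving the assumed utilization doubles the estimate, which is one reason third-party cost estimates for the same run can differ widely.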

Open Weights vs Closed Labs

Side-by-side Chatbot Arena Elo rankings: top closed-API models (GPT-5, Claude Opus 4.5, Gemini 2.5 Pro, Grok 4) against top open-weights models (DeepSeek-R1, Qwen3 235B, GLM-4.5, Llama 4 Maverick). Tracks the closing-gap thesis that emerged after the DeepSeek-R1 release in January 2025.

Top Closed vs Top Open Gap (Chatbot Arena Elo)

The gap between the #1 closed model (GPT-5, 1410) and the #1 open-weights model (DeepSeek-R1, 1357) was 53 Elo at the October 2025 snapshot. Twelve months earlier (October 2024), the equivalent gap was approximately 100 Elo (GPT-4o vs Llama 3.1 405B). Two years earlier (October 2023), the gap was greater than 200 Elo.
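To give the Elo gaps a more concrete meaning, the standard Elo expected-score formula converts a rating difference into an expected head-to-head win rate. The sketch below applies it to the three gaps quoted above. Treating Arena scores with the classic 400-point logistic formula is an assumption here, and ties are ignored.

```python
def expected_win_rate(elo_gap: float) -> float:
    """Expected win rate of the higher-rated model under the classic
    Elo logistic formula (400-point scale, base 10)."""
    return 1.0 / (1.0 + 10 ** (-elo_gap / 400))


gap_oct_2025 = 1410 - 1357  # GPT-5 vs DeepSeek-R1 = 53

win_rates = {
    "Oct 2025 (53 Elo)": expected_win_rate(gap_oct_2025),
    "Oct 2024 (~100 Elo)": expected_win_rate(100),
    "Oct 2023 (200 Elo, a lower bound)": expected_win_rate(200),
}
# About 0.58, 0.64, and 0.76 respectively.
```

Under this reading, a 53 Elo lead means the top closed model would be preferred in somewhat under six of ten pairwise votes, compared with roughly three of four at a 200 Elo gap.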

Closed-API (top 6)

| # | Model | Lab (release) | Arena Elo |
|---|-------|---------------|-----------|
| 1 | GPT-5 | OpenAI (2025-08) | 1410 |
| 2 | Gemini 2.5 Pro | Google DeepMind (2025-03) | 1397 |
| 3 | Claude Opus 4.5 | Anthropic (2025-09) | 1388 |
| 4 | Grok 4 | xAI (2025-07) | 1380 |
| 5 | OpenAI o3 | OpenAI (2025-04) | 1374 |
| 6 | Claude Sonnet 4.5 | Anthropic (2025-09) | 1370 |

Open-Weights (top 6)

| # | Model | Lab (release) | Arena Elo | License |
|---|-------|---------------|-----------|---------|
| 1 | DeepSeek-R1 | DeepSeek (2025-01) | 1357 | MIT (model weights); DeepSeek License (commercial use allowed) |
| 2 | Qwen3 235B | Alibaba (2025-04) | 1331 | Apache-2.0 |
| 3 | GLM-4.5 | Zhipu (Z.ai) (2025-07) | 1325 | MIT (open-weights tier) |
| 4 | Llama 4 Maverick | Meta (2025-04) | 1271 | Llama 4 Community License (commercial use under 700M MAU) |
| 5 | Qwen 2.5 Max | Alibaba (2025-01) | 1253 | API-only for Qwen 2.5 Max; Qwen 2.5 72B is Apache-2.0 |
| 6 | DeepSeek-V3 | DeepSeek (2024-12) | 1247 | MIT |

Inflection

2025-01-20: DeepSeek-R1 release. First open-weights reasoning model to enter the top 10 of the Arena leaderboard. Triggered a single-day drop of approximately $600B in NVIDIA's market value on 2025-01-27 ('China AI shock'). (Bloomberg 2025-01-27)

Closing-Gap Thesis

  • Open-weights models reached the top tier of the Arena leaderboard in early 2025 (DeepSeek-R1) and held position through October 2025.
  • The capability gap shrank from more than 200 Elo (October 2023) to approximately 50 Elo (October 2025), a reduction of roughly 75 percent in 24 months.
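  • Pricing pressure: open-weights models are priced at a multiple below closed flagships at comparable Arena Elo. DeepSeek-R1 ($0.55 / $2.19) is roughly 2.3x cheaper on input and 4.6x cheaper on output than GPT-5 ($1.25 / $10.00), roughly 5.5x and 6.8x cheaper than Claude Sonnet 4.5 or Grok 4 ($3.00 / $15.00), and roughly 27x and 34x cheaper than Claude Opus 4.5 ($15.00 / $75.00).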
  • Implication: substitutability is rising. Enterprises increasingly deploy open weights on owned infrastructure or via neoclouds (Together, Fireworks) for cost reasons. Closed-lab pricing power remains in extended-thinking, multimodal, and agent-tool surfaces.
  • Cross-reference: this dynamic is one driver of the M&A and consolidation thesis covered in Tab 6 (Capital and M&A) of the AI Software sector.
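The percentages and multiples in the thesis can be checked directly against the figures elsewhere on this page. The sketch below recomputes the gap reduction from the Elo snapshot figures and the price multiples from the API Token Pricing table; the helper names are illustrative, and the 200 Elo starting gap is the stated lower bound for October 2023.

```python
def pct_reduction(before: float, after: float) -> float:
    """Percentage by which a quantity shrank from before to after."""
    return 100.0 * (before - after) / before


def price_multiple(closed_price: float, open_price: float) -> float:
    """How many times more expensive the closed model is."""
    return closed_price / open_price


# Elo gap: at least 200 (October 2023) down to 53 (October 2025).
gap_reduction = pct_reduction(200, 53)  # 73.5 percent

# (input, output) list prices in $/MTok from the pricing table.
deepseek_r1 = (0.55, 2.19)
closed = {
    "GPT-5": (1.25, 10.00),
    "Claude Sonnet 4.5": (3.00, 15.00),
    "Claude Opus 4.5": (15.00, 75.00),
}
multiples = {
    name: (price_multiple(inp, deepseek_r1[0]), price_multiple(out, deepseek_r1[1]))
    for name, (inp, out) in closed.items()
}
```

The spread across these three comparisons is wide, so any single "Nx cheaper" headline depends heavily on which closed model is taken as the reference.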

Side-by-side Chatbot Arena Elo (LMSys) snapshot dated 2025-10-15. Arena updates daily; rankings shift after each release and after style-control recomputation. Higher Elo is better.
