Production inference platform for open-source AI; 4 to 9x faster than vLLM on high-traffic apps like Cursor and Notion
| Metric | Value | Notes |
|---|---|---|
| Annualized Revenue (May 2026) | ~$800M | Per Sacra; up from ~$305M (end-2025) |
| Valuation | $4B closed / $15B in talks | Series C closed Oct 2025 at $4B; $15B reported in talks (May 2026) |
| Inference Speed-up vs vLLM | 4 to 9x | FireOptimizer kernel-level optimizations; cornerstone of competitive moat |
| Founder Pedigree | Lin Qiao (ex-Meta PyTorch lead) | Led the PyTorch team at Meta; deep low-level systems credibility |

Total raised: ~$330M+ · Latest valuation: $4B (Oct 2025 closed); ~$15B reported in talks (May 2026)
| Round | Date | Amount | Valuation | Key Investors |
|---|---|---|---|---|
| Seed | 2022 | Undisclosed | - | Benchmark, Y Combinator |
| Series A | 2023 | $25M | - | Benchmark (lead), Y Combinator, Databricks Ventures, Snowflake Ventures |
| Series B | Jul 2024 | $52M | $552M | Sequoia (lead), Benchmark, Databricks Ventures, Snowflake Ventures, +1 more |
| Series C | Oct 2025 | $250M | $4B | Lightspeed (co-lead), Index Ventures (co-lead), Evantic, Sequoia |
| New Round (reported) | May 2026 (reported) | In talks | $15B (reported) | Index Ventures (co-lead, reported); round not closed |
| Product | Type | Status |
|---|---|---|
| Fireworks Inference | High-throughput open-source model serving | GA |
| FireOptimizer | Custom kernel + speculative-decoding optimizations | GA |
| FireFunction | Function-calling model + framework | GA |
| F1 | Fireworks-developed reasoning model | GA (Sep 2025) |
| Fireworks Fine-Tuning + LoRA | Custom model adaptation services | GA |
| Compound AI Systems | Multi-model orchestration framework for agentic workflows | GA |
Cursor's Tab + completion latency depends on Fireworks; one of the highest-RPS AI products in the industry.
Notion AI built on Fireworks for low-latency LLM inference across collaborative documents.
High-traffic consumer + workflow AI products served via Fireworks.
Databricks and Snowflake: cross-data-cloud strategic positioning; Fireworks integrates with both data platforms.
Lightspeed and Index Ventures co-led the Oct 2025 Series C at a $4B valuation.
Lin Qiao (former Director of Engineering at Meta; PyTorch leader) and team found Fireworks
FireOptimizer (custom kernel-level optimizations) ships; 4 to 9x throughput improvements over open-source inference servers
Series B (Jul 2024): Sequoia leads a $52M round; Fireworks is the inference layer behind multiple high-traffic AI products
Function-calling model + compound-AI system frameworks released; targets agentic workflows
Two of the highest-traffic AI products in the industry standardize on Fireworks for inference
F1 (Sep 2025): Fireworks's first in-house reasoning model; combines compound AI with speculative decoding
Series C (Oct 2025): $250M co-led by Lightspeed and Index Ventures at a $4B valuation
May 2026: Reportedly in talks at a $15B valuation, nearly 4x the October 2025 mark; annualized revenue reportedly ~$800M
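
The headline multiples in this profile can be checked with a few lines of arithmetic. The sketch below is illustrative only: it uses the figures quoted in this profile (in USD millions), the variable and function names are assumptions made for the example, and the undisclosed seed round is left out, so the funding total is a floor rather than an exact figure.

```python
# Figures copied from this profile, in USD millions.
# All names are illustrative; the seed amount is undisclosed and excluded.
disclosed_rounds = {"Series A": 25, "Series B": 52, "Series C": 250}
series_b_valuation = 552
series_c_valuation = 4_000
reported_valuation = 15_000  # reported in talks, round not closed
revenue_end_2025 = 305       # approximate annualized revenue, per Sacra
revenue_may_2026 = 800       # approximate annualized revenue, per Sacra


def multiple(later: float, earlier: float) -> float:
    """Return how many times larger `later` is than `earlier`."""
    return later / earlier


total_disclosed = sum(disclosed_rounds.values())
b_to_c = multiple(series_c_valuation, series_b_valuation)
c_to_reported = multiple(reported_valuation, series_c_valuation)
revenue_growth = multiple(revenue_may_2026, revenue_end_2025)
valuation_to_revenue = multiple(reported_valuation, revenue_may_2026)

print(f"Disclosed funding, excluding seed: ${total_disclosed}M")
print(f"Series B to Series C valuation step-up: {b_to_c:.2f}x")
print(f"Series C to reported valuation step-up: {c_to_reported:.2f}x")
print(f"Annualized revenue, end-2025 to May 2026: {revenue_growth:.2f}x")
print(f"Reported valuation / annualized revenue: {valuation_to_revenue:.2f}x")
```

The disclosed rounds sum to $327M, which is consistent with the "~$330M+" total once an undisclosed seed is added, and the Series C to reported-talks step-up works out to 3.75x, matching the "nearly 4x" description.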