Designs and sells graphics processing units for gaming, professional visualization, data centers, and automotive.
Dominates AI training and inference with its GPU and data center accelerator platforms including H100, B200, and GB200. Commands ~80% of the AI accelerator market with $130B+ annual revenue run rate and a $3T+ market cap. Aims to power the entire AI compute stack from cloud training to edge inference, autonomous vehicles, and robotics.
The undisputed king of AI compute — designs the GPUs that train virtually every major AI model.
H100
Data Center GPU: Current workhorse for AI training
B200 / GB200
Data Center GPU: Next-gen Blackwell architecture
A100
Data Center GPU: Previous-gen, still widely deployed
DGX Systems
AI Supercomputer: Turnkey AI training servers
CUDA / cuDNN
Software Platform: Industry-standard AI dev ecosystem
DRIVE Platform
Automotive AI: Self-driving compute platform
Market Share
~80% of AI accelerator market
Competitive Moat
CUDA software lock-in, 15+ years of ecosystem development
Key Risk
Customer concentration in hyperscalers; custom chip competition from Google TPU, Amazon Trainium
If you want to understand AI infrastructure, start here. NVIDIA's GPUs are the foundation of virtually every AI system — from ChatGPT to autonomous vehicles. Their dominance is unmatched in tech.
Key Milestones
Founded April 5, 1993 by Jensen Huang (ex-LSI Logic, AMD), Chris Malachowsky (Sun) and Curtis Priem (IBM, Sun) at a Denny's in San Jose with $40K; named after the Latin 'invidia', with an initial focus on PC graphics accelerators.
Released NV1, NVIDIA’s first chip, built around Sega Saturn-style quadrilateral rendering; a commercial flop that nearly bankrupted the company before rescue funding from SGS-Thomson and Sega.
Released RIVA 128, the company’s first commercially successful 3D graphics chip; sold 1M units in four months and saved NVIDIA from the brink of insolvency.
IPO on NASDAQ January 22, 1999 at $12/share, raising $42M; the small offering reflected modest revenue (~$160M) but funded the RIVA and TNT2 graphics generations.
Announced GeForce 256 August 31, 1999 and shipped it October 11; coined the term 'GPU' (Graphics Processing Unit) by integrating hardware transform-and-lighting on a single chip with 23M transistors.
Acquired struggling rival 3dfx Interactive for $112M after the Voodoo card maker filed bankruptcy; NVIDIA absorbed engineering talent and IP that powered the GeForce3.
GeForce3 launched with first programmable vertex and pixel shaders, the foundational architecture for every modern GPU; Microsoft selected the chip for the original Xbox console.
Unveiled GeForce 8800 GTX (G80) with unified shaders and first CUDA-capable architecture; laid the groundwork for general-purpose GPU computing and modern AI training.
Released CUDA 1.0 SDK February 15 for Windows and Linux; opened the GPU as a programmable parallel processor and seeded the scientific-computing and deep-learning communities.
Launched Tesla brand for HPC GPUs (Tesla C870/C1060) priced at $1,500-$1,700; first credible move beyond gaming into the data-center compute market.
AlexNet won ImageNet using two NVIDIA GTX 580 GPUs, dropping image-classification error rates from 26% to 15%; the watershed moment validated GPUs as the substrate of modern deep learning.
Tesla K80 launched, GK210 Kepler-based dual-GPU accelerator targeting deep learning workloads; the chip became the workhorse of early academic and hyperscaler ML training.
Tesla P100 unveiled at GTC 2016, first Pascal data-center GPU with HBM2 memory and NVLink interconnect; targeted deep-learning training and HPC at $5,000+ per GPU.
DGX-1 announced at GTC 2016 as first purpose-built AI supercomputer-in-a-box; eight P100 GPUs at $129K, with first system hand-delivered by Jensen Huang to OpenAI.
Tesla V100 with Volta architecture announced at GTC, introducing Tensor Cores for AI acceleration at 125 TFLOPS FP16; powered the GPT-2/GPT-3 training era at OpenAI and Microsoft.
Announced $6.9B acquisition of Mellanox to bolster networking for data-center AI fabrics; preempted Intel and Microsoft bids and locked in InfiniBand for next-gen GPU clusters.
Closed $7B Mellanox acquisition after Chinese antitrust clearance, gaining InfiniBand and Ethernet IP that became the spine of every NVIDIA reference AI cluster from DGX-2 onward.
A100 launched at GTC 2020 on TSMC 7nm with 40GB HBM2 and 19.5 TFLOPS FP32; introduced multi-instance GPU (MIG) and became the dominant LLM training chip through 2022.
Announced $40B agreement to acquire Arm Holdings from SoftBank; the deal would have given NVIDIA control of the world's most-used CPU instruction set but drew immediate global regulatory scrutiny.
Abandoned $40B Arm acquisition due to FTC and global regulatory blockades; SoftBank kept the $1.25B prepayment and prepared Arm for its 2023 IPO.
Hopper H100 announced at GTC March 22, 2022, with 80B transistors on TSMC 4N, 80GB HBM3 and a Transformer Engine for FP8; became the de facto LLM training chip and pushed lead times to 11+ months by mid-2023.
H100 entered full production with hyperscaler partner systems shipping in early Q4; Microsoft, AWS, Google and Oracle placed initial orders for 100K+ GPUs each.
US BIS export controls of October 7, 2022 banned A100/H100 sales to China; NVIDIA pivoted to A800/H800 China-specific SKUs with reduced interconnect bandwidth, preserving most China revenue.
OpenAI launched ChatGPT November 30, 2022, igniting a global AI compute demand surge that Jensen Huang called AI's 'iPhone moment'; data-center orders surged from <$4B/quarter to $14B+ within four quarters.
DGX H100 systems began shipping to enterprise customers, anchoring the first wave of post-ChatGPT AI buildouts; Microsoft Azure and Oracle Cloud took the first racks for OpenAI workloads.
Crossed $1T market cap on May 30, 2023 on a record AI revenue forecast (Q2 FY24 guide of $11B vs $7.2B consensus); joined Apple, Microsoft, Alphabet and Aramco in the trillion-dollar club.
October 17, 2023 BIS update banned A800/H800 China-specific chips and added a performance-density rule covering future SKUs; NVIDIA began designing H20/L20 for the China market under the new caps.
H200 announced at SC23, first GPU with HBM3e memory at 141GB and 4.8 TB/s bandwidth; targeted Q2 2024 ship date and locked in SK Hynix as lead HBM3e supplier.
Surpassed $2T market cap intraday February 23, 2024, becoming the third US firm to close above the threshold; took just nine months from $1T to $2T, the fastest such doubling in market history.
Blackwell B200 and GB200 superchip unveiled at GTC March 18, 2024, with 208B transistors and up to 20 PFLOPS FP4 per GPU; built on TSMC 4NP with CoWoS-L packaging and 192GB HBM3e.
GB200 NVL72 announced at GTC, liquid-cooled rack of 36 Grace Blackwell superchips for trillion-parameter LLMs; 1.4 exaFLOPS FP4 compute per rack, priced at ~$3M each.
Surpassed $3T market cap June 5, 2024 and briefly became the world's most valuable company on June 18 at ~$3.34T, displacing Microsoft and Apple atop the global leaderboard.
GB200 NVL72 systems began shipping to AWS, Microsoft, Google and Oracle after multi-quarter ramp delays; CoWoS-L packaging yields finally crossed acceptable threshold in Q4.
Blackwell Ultra B300 unveiled at GTC March 18, 2025, with 1.5x B200 performance, 288GB HBM3e and 15 PFLOPS dense FP4; positioned as a bridge product before the Rubin platform in 2026.
Took a $5.5B Q1 charge on H20 China inventory after an April 2025 BIS rule blocked the last China-tailored Hopper SKU; Jensen Huang publicly criticized the rule as an own-goal helping Huawei.
Became the first $4T public company on July 9, 2025, capping a 10x rally since the ChatGPT launch; market cap doubled in about 16 months after first crossing $2T in February 2024.
The Trump administration permitted resumed H20 sales to China, reversing the April ban after a diplomatic deal; NVIDIA recovered part of its inventory write-down through fresh shipments.
Crossed $5T market cap intraday October 29, 2025, the first company ever to reach that milestone; just 3.5 months after the $4T close, the fastest such trillion-dollar leg in history.
Amazon EC2 P6-B300 instances with Blackwell Ultra became generally available, anchoring AWS AI fleet upgrades; instance prices started around $98/hour for 8x B300 nodes.
Began Rubin GPU production at TSMC on N3 node ahead of GTC 2026 mass announcement; SK Hynix HBM4 stacks shipped from September were paired in CoWoS-L packaging.
Vera Rubin platform announced at CES, Vera CPU + dual Rubin GPU in one package with NVLink 6 and ConnectX-9; positioned as full vertically-integrated AI factory replacement for Blackwell.
Vera Rubin entered full production at GTC April 2026; locked in 336B transistors, 50 PFLOPS per GPU and 288GB HBM4; AWS, Google, Microsoft and Oracle named as first cloud takers.