NVIDIA Hopper H200

Arch: Hopper | Production Ready | TDP 700W
Compute Performance (FP8 Tensor)
3.9 PetaFLOPS
Peak FP8 Tensor performance with Transformer Engine (with sparsity)
H200 Hopper: 3.9 PF
H100 Hopper: 3.9 PF (with sparsity; 2.0 PF dense)
A100 Ampere: 0.6 PF (FP16 with sparsity; no FP8 support)
Memory System
141 GB HBM3e
8-Hi stacks / 6144-bit interface
4.8 TB/s Bandwidth
Interconnect & I/O
900 GB/s NVLink 4
Bi-directional total bandwidth
PCIe Gen 5.0 x16
Real-World Applications
Large Language Model Inference

The H200's 141 GB of HBM3e allows larger models to be served entirely in GPU memory, reducing the need for costly model parallelism for models up to about 70B parameters. NVIDIA reports nearly 2x the inference throughput of H100 for LLM workloads.
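As a rough check on the capacity claim, a model's weight footprint is approximately its parameter count times the bytes stored per parameter. The sketch below is illustrative arithmetic only: the function names and the 10% memory reserve for KV cache and activations are assumptions made for this example, not NVIDIA figures.

```python
# Back-of-envelope check: do a model's weights fit in one H200's HBM3e?
# Assumptions: decimal GB (1e9 bytes); runtime overheads are not modeled.

H200_MEMORY_GB = 141  # capacity quoted on this page

BYTES_PER_PARAM = {"fp16": 2, "fp8": 1}


def weight_footprint_gb(params_billion, dtype):
    """Approximate weight size in GB: parameters x bytes per parameter."""
    return params_billion * BYTES_PER_PARAM[dtype]


def fits_on_one_gpu(params_billion, dtype, headroom_fraction=0.10):
    """True if the weights fit while keeping headroom_fraction of memory
    free for KV cache and activations (hypothetical reserve; tune per workload)."""
    usable_gb = H200_MEMORY_GB * (1.0 - headroom_fraction)
    return weight_footprint_gb(params_billion, dtype) <= usable_gb


if __name__ == "__main__":
    for dtype in ("fp16", "fp8"):
        size_gb = weight_footprint_gb(70, dtype)
        verdict = fits_on_one_gpu(70, dtype)
        print(f"70B @ {dtype}: {size_gb:.0f} GB of weights, fits with headroom: {verdict}")
```

Under these assumptions, whether a 70B model fits on a single GPU depends strongly on weight precision and on how much memory the serving stack needs beyond the weights.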

Recommendation Systems at Scale

Embedding tables for production recommendation models can reside entirely in HBM3e when they fit within 141 GB, which reduces latency by eliminating host memory round-trips. This suits real-time ad serving and content recommendation at hyperscale.
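To make the round-trip argument concrete, the sketch below sizes a set of embedding tables and compares a bandwidth-only lower bound on lookup time from HBM3e against the same transfer over PCIe. The table shapes are hypothetical, the PCIe figure is an approximate theoretical per-direction rate for a Gen 5 x16 link, and real gather operations will not reach either peak.

```python
# Rough sizing of embedding tables against H200 capacity, plus a
# bandwidth-only comparison of HBM3e versus host memory over PCIe.
# Table shapes are hypothetical examples, not measured workloads.

H200_MEMORY_GB = 141        # capacity quoted on this page
H200_BANDWIDTH_GBPS = 4800  # 4.8 TB/s, quoted on this page
PCIE5_X16_GBPS = 64         # assumption: approximate theoretical per-direction rate


def table_size_gb(rows, dim, bytes_per_value=4):
    """Size of one embedding table in decimal GB (FP32 values by default)."""
    return rows * dim * bytes_per_value / 1e9


def lookup_time_us(num_lookups, dim, bandwidth_gbps, bytes_per_value=4):
    """Lower bound, in microseconds, on moving num_lookups embedding rows
    at the given bandwidth. Ignores latency and access-pattern effects."""
    bytes_moved = num_lookups * dim * bytes_per_value
    return bytes_moved / (bandwidth_gbps * 1e9) * 1e6


if __name__ == "__main__":
    tables = [(200_000_000, 128), (50_000_000, 64)]  # (rows, dim), hypothetical
    total_gb = sum(table_size_gb(rows, dim) for rows, dim in tables)
    print(f"total: {total_gb:.1f} GB, fits in HBM3e: {total_gb <= H200_MEMORY_GB}")

    hbm_us = lookup_time_us(1_000_000, 128, H200_BANDWIDTH_GBPS)
    pcie_us = lookup_time_us(1_000_000, 128, PCIE5_X16_GBPS)
    print(f"1M lookups: HBM3e >= {hbm_us:.0f} us, PCIe >= {pcie_us:.0f} us")
```

The ratio between the two bounds is simply the ratio of the bandwidths, which is why keeping the tables resident in GPU memory matters for latency-sensitive serving.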

High-Performance Computing

The 4.8 TB/s memory bandwidth accelerates memory-bound HPC workloads such as weather forecasting, seismic analysis, and computational chemistry simulations. NVIDIA cites up to 110x faster time to results compared with CPUs.
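Why bandwidth dominates for these workloads can be illustrated with a simple roofline estimate: attainable throughput is the smaller of the compute peak and arithmetic intensity times memory bandwidth. The sketch below is a minimal version of that model. The peak value is a placeholder assumption for illustration and should be replaced with the datasheet figure for the precision in use; the example intensities are hypothetical.

```python
# Minimal roofline sketch:
#   attainable FLOP/s = min(compute peak, arithmetic intensity x bandwidth)

H200_BANDWIDTH_TBPS = 4.8    # quoted on this page
ASSUMED_PEAK_TFLOPS = 34.0   # assumption: placeholder peak; use the datasheet value


def attainable_tflops(intensity_flop_per_byte,
                      peak_tflops=ASSUMED_PEAK_TFLOPS,
                      bandwidth_tbps=H200_BANDWIDTH_TBPS):
    """Roofline bound in TFLOP/s for a kernel with the given arithmetic
    intensity (FLOPs performed per byte moved to or from memory)."""
    return min(peak_tflops, intensity_flop_per_byte * bandwidth_tbps)


def ridge_point(peak_tflops=ASSUMED_PEAK_TFLOPS,
                bandwidth_tbps=H200_BANDWIDTH_TBPS):
    """Intensity (FLOP/byte) below which a kernel is memory-bound."""
    return peak_tflops / bandwidth_tbps


if __name__ == "__main__":
    print(f"ridge point: {ridge_point():.2f} FLOP/byte")
    for intensity in (0.25, 1.0, 16.0):  # hypothetical kernels
        bound = attainable_tflops(intensity)
        print(f"intensity {intensity:>5} FLOP/byte -> at most {bound:.2f} TFLOP/s")
```

Kernels whose intensity falls below the ridge point scale with memory bandwidth rather than compute, so they benefit directly from faster HBM.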

Multi-Modal AI Training

Expanded memory capacity allows vision-language and diffusion models to be trained with larger batch sizes. The H200 supports training runs on datasets combining text, images, and video with fewer memory constraints than on earlier GPUs.

Full Technical Specifications
GPU Architecture: NVIDIA Hopper
Transistor Count: 80 billion (TSMC 4N process)
CUDA Cores: 16,896
Tensor Cores: 528 (4th generation)
Memory Capacity: 141 GB HBM3e
Memory Interface: 6144-bit
Memory Bandwidth: 4.8 TB/s
L2 Cache: 50 MB
Form Factor: SXM5
Thermal Design Power: Up to 700W (configurable)