Compute Performance (FP4 Tensor)

▲ 7.5x vs H100

15.0 PetaFLOPS

Dense FP4 performance with 2nd Gen Transformer Engine and 288GB HBM3e

B300 Blackwell Ultra: 15.0 PF

B200 Blackwell: 20.0 PF (FP4 with sparsity; about 10 PF dense)

H100 Hopper: 2.0 PF (dense FP8; Hopper has no FP4 Tensor Core mode)

Memory System

288GB HBM3e

12-Hi Stacks / 8192-bit interface

8.0 TB/s Bandwidth
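
The memory figures above can be cross-checked with simple arithmetic. The sketch below derives the per-pin data rate, stack count, and per-die capacity implied by the quoted numbers. The 1024-bit per-stack interface width is an assumption (the usual HBM3e stack width), not something stated on this page.

```python
# Sanity-check the memory figures quoted on this page.
BUS_WIDTH_BITS = 8192        # from the spec sheet
BANDWIDTH_TB_S = 8.0         # from the spec sheet (decimal TB/s)
CAPACITY_GB = 288            # from the spec sheet
DIES_PER_STACK = 12          # "12-Hi" stacks, from the spec sheet
STACK_WIDTH_BITS = 1024      # ASSUMPTION: standard HBM3e per-stack interface


def per_pin_gbps(bandwidth_tb_s, bus_width_bits):
    """Per-pin data rate in Gb/s implied by the total bandwidth."""
    total_bits_per_s = bandwidth_tb_s * 1e12 * 8
    return total_bits_per_s / bus_width_bits / 1e9


stacks = BUS_WIDTH_BITS // STACK_WIDTH_BITS
pin_rate_gbps = per_pin_gbps(BANDWIDTH_TB_S, BUS_WIDTH_BITS)
gb_per_stack = CAPACITY_GB / stacks
gb_per_die = gb_per_stack / DIES_PER_STACK

print(f"HBM stacks:        {stacks}")
print(f"Per-pin data rate: {pin_rate_gbps:.4f} Gb/s")
print(f"Capacity per stack: {gb_per_stack:.0f} GB ({gb_per_die:.0f} GB per die)")
```

Under that assumption the numbers are self-consistent: eight stacks, each twelve dies high, account for both the 8192-bit interface and the 288GB capacity.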

Interconnect & I/O

1.8 TB/s NVLink 5

Bi-directional total bandwidth

PCIe Gen 6.0 x16
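
To put the two links in perspective, the following sketch compares per-direction bandwidth and estimates how long it would take to move one GPU's worth of memory over each. The PCIe figure is the raw signalling rate (64 GT/s per lane for Gen 6), ignoring protocol overhead, so treat the results as upper bounds rather than measured throughput.

```python
# Rough comparison of NVLink 5 and PCIe Gen 6 x16 (raw rates, no overhead).
NVLINK_TOTAL_GB_S = 1800.0    # from the spec sheet: 1.8 TB/s bidirectional
PCIE6_GT_S_PER_LANE = 64      # PCIe 6.0 raw rate per lane, per direction
PCIE_LANES = 16               # from the spec sheet: x16
HBM_CAPACITY_GB = 288         # from the spec sheet

nvlink_per_dir = NVLINK_TOTAL_GB_S / 2                # GB/s, one direction
pcie_per_dir = PCIE6_GT_S_PER_LANE * PCIE_LANES / 8   # GT/s -> GB/s, raw


def transfer_seconds(size_gb, link_gb_s):
    """Idealised time to move size_gb over a link running at link_gb_s."""
    return size_gb / link_gb_s


print(f"NVLink per direction: {nvlink_per_dir:.0f} GB/s")
print(f"PCIe   per direction: {pcie_per_dir:.0f} GB/s")
print(f"Move 288 GB over NVLink: {transfer_seconds(HBM_CAPACITY_GB, nvlink_per_dir):.2f} s")
print(f"Move 288 GB over PCIe:   {transfer_seconds(HBM_CAPACITY_GB, pcie_per_dir):.2f} s")
```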

Real-World Applications

Trillion-Parameter Model Training

With 288GB of HBM3e, 50% more than the B200's 192GB, the B300 fits trillion-parameter training runs onto fewer GPUs. The larger per-GPU memory reduces the degree of model parallelism required, which cuts communication overhead and shortens training time for frontier-scale models.
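
A rough estimate shows how per-GPU memory sets a floor on GPU count. The sketch below assumes 16 bytes of training state per parameter (a common rule of thumb for mixed-precision Adam: 2 for weights, 2 for gradients, 12 for FP32 master weights and optimizer moments) and ignores activations, so real deployments need more. The byte count and the helper name `min_gpus` are illustrative assumptions, not vendor figures.

```python
# Lower bound on GPUs needed just to hold the training state of a model.
PARAMS = 10**12              # one trillion parameters
BYTES_PER_PARAM = 16         # ASSUMPTION: mixed-precision Adam rule of thumb
B300_MEM_GB = 288            # from the spec sheet
B200_MEM_GB = 192            # 288 GB is quoted as 50% more than B200


def min_gpus(params, bytes_per_param, gpu_mem_gb):
    """Smallest GPU count whose combined memory holds the training state."""
    total_bytes = params * bytes_per_param
    gpu_bytes = gpu_mem_gb * 10**9
    return -(-total_bytes // gpu_bytes)   # integer ceiling division


b300_gpus = min_gpus(PARAMS, BYTES_PER_PARAM, B300_MEM_GB)
b200_gpus = min_gpus(PARAMS, BYTES_PER_PARAM, B200_MEM_GB)

print(f"Training state: {PARAMS * BYTES_PER_PARAM / 1e12:.0f} TB")
print(f"Minimum B300 GPUs: {b300_gpus}")
print(f"Minimum B200 GPUs: {b200_gpus}")
```

Under these assumptions the state still has to be sharded across dozens of GPUs, but the larger memory lowers the minimum count by about a third.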

AI Reasoning & Long-Context Inference

Purpose-built for the era of AI reasoning, the B300 excels at chain-of-thought inference workloads where massive KV-cache memory is critical. A single DGX B300 delivers 192 PFLOPS for inference, enabling real-time reasoning at scale for agentic AI systems.
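
To illustrate why KV-cache memory matters for long-context inference, here is a minimal sizing sketch. Every model parameter in it (layer count, KV heads, head dimension, cache precision, weight footprint) is hypothetical and chosen only for illustration; none of it describes a specific model or an NVIDIA benchmark.

```python
# KV-cache sizing for a HYPOTHETICAL transformer on a single 288 GB GPU.
NUM_LAYERS = 80          # hypothetical
NUM_KV_HEADS = 8         # hypothetical (grouped-query attention)
HEAD_DIM = 128           # hypothetical
BYTES_PER_ELEM = 2       # hypothetical: FP16/BF16 cache entries
GPU_MEM_GB = 288         # from the spec sheet
WEIGHTS_GB = 70          # hypothetical: 70B parameters stored in FP8


def kv_bytes_per_token(layers, kv_heads, head_dim, bytes_per_elem):
    """Bytes of cache per token: one key and one value vector per layer."""
    return 2 * layers * kv_heads * head_dim * bytes_per_elem


per_token = kv_bytes_per_token(NUM_LAYERS, NUM_KV_HEADS, HEAD_DIM, BYTES_PER_ELEM)
cache_budget_bytes = (GPU_MEM_GB - WEIGHTS_GB) * 10**9
max_cached_tokens = cache_budget_bytes // per_token

print(f"KV cache per token: {per_token / 1024:.0f} KiB")
print(f"Tokens that fit in the remaining memory: {max_cached_tokens:,}")
```

The token budget is shared across all concurrent sequences, so longer reasoning chains directly reduce the batch size a single GPU can serve.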

Exascale Scientific Computing

The GB300 NVL72 rack delivers 1.1 ExaFLOPS of dense FP4 compute (72 GPUs at 15 PF each) within a single rack-scale NVLink domain. This supports AI-driven climate simulations, drug discovery pipelines, and physics research at higher resolution without requiring multi-rack interconnects.
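
The rack-level figure follows from the per-GPU numbers on this page. The short sketch below multiplies them out, taking the GPU count of 72 from the NVL72 product name; the aggregate memory figure is derived the same way and is not quoted on this page.

```python
# Aggregate a GB300 NVL72 rack from the per-GPU figures on this page.
GPUS_PER_RACK = 72           # from the "NVL72" product name
FP4_DENSE_PF_PER_GPU = 15.0  # from the spec sheet
HBM_GB_PER_GPU = 288         # from the spec sheet

rack_pf = GPUS_PER_RACK * FP4_DENSE_PF_PER_GPU
rack_ef = rack_pf / 1000
rack_hbm_gb = GPUS_PER_RACK * HBM_GB_PER_GPU

print(f"Rack dense FP4 compute: {rack_pf:.0f} PF = {rack_ef:.2f} EF")
print(f"Rack HBM3e capacity:    {rack_hbm_gb / 1000:.1f} TB")
```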

Video Generation & World Models

The B300's 288GB memory capacity supports next-generation video generation models and world simulators that require massive context windows. Train and serve models that generate minutes of coherent video or simulate complex 3D environments in real time.

Full Technical Specifications

GPU Architecture	NVIDIA Blackwell Ultra
Transistor Count	208 Billion (4NP Process)
Die Size	Dual-Die CoWoS-L (Reticle Limit x2)
Tensor Cores	5th Gen (Enhanced)
Memory Capacity	288 GB HBM3e
Memory Interface	8192-bit (12-Hi Stacks)
Memory Bandwidth	8.0 TB/s
NVLink Bandwidth	1.8 TB/s
Form Factor	SXM6
Thermal Design Power	1400W (Liquid Cooled)
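
One way to read these specifications together is a simple roofline estimate. The sketch below combines the quoted 15.0 PF dense FP4 peak with the 8.0 TB/s memory bandwidth to find the arithmetic intensity at which a kernel stops being memory-bound, and derives peak compute per watt from the 1400W TDP. It is an idealised model that assumes peak figures are achievable, so treat the outputs as upper bounds.

```python
# Idealised roofline model built from the quoted peak figures.
PEAK_FLOPS = 15.0e15         # dense FP4, from the headline figure
MEM_BW_BYTES_S = 8.0e12      # from the spec table
TDP_WATTS = 1400             # from the spec table

# Arithmetic intensity (FLOP per byte) where compute and memory limits meet.
ridge_point = PEAK_FLOPS / MEM_BW_BYTES_S


def attainable_flops(intensity_flop_per_byte):
    """Upper bound on throughput for a kernel of the given intensity."""
    return min(PEAK_FLOPS, MEM_BW_BYTES_S * intensity_flop_per_byte)


tflops_per_watt = PEAK_FLOPS / TDP_WATTS / 1e12

print(f"Ridge point: {ridge_point:.0f} FLOP/byte")
print(f"Kernel at 100 FLOP/byte:  {attainable_flops(100) / 1e15:.1f} PF (memory-bound)")
print(f"Kernel at 5000 FLOP/byte: {attainable_flops(5000) / 1e15:.1f} PF (compute-bound)")
print(f"Peak efficiency: {tflops_per_watt:.1f} TFLOPS/W")
```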

NVIDIA Blackwell Ultra B300