Rubin's 50 PFLOPS of FP4 compute and 22 TB/s of HBM4 bandwidth are purpose-built for deep reasoning models. Chain-of-thought inference with million-token context windows is designed to run without the memory bottlenecks that constrain Blackwell-era deployments.
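To make the million-token claim concrete, the following is a rough sizing sketch rather than a benchmark. The model shape (`HypotheticalModel`: 80 layers, 8 KV heads, head dimension 128, 8-bit cache) is an assumption chosen for illustration and does not describe any real or announced model. Only the 288 GB capacity and 22 TB/s bandwidth come from the figures on this page.

```python
from dataclasses import dataclass

HBM_CAPACITY_BYTES = 288e9      # 288 GB HBM4, from the spec table
HBM_BANDWIDTH_BYTES_S = 22e12   # 22 TB/s, from the spec table


@dataclass(frozen=True)
class HypotheticalModel:
    """Illustrative transformer shape; not a real or announced model."""
    layers: int = 80
    kv_heads: int = 8
    head_dim: int = 128
    bytes_per_value: int = 1  # assumption: 8-bit KV cache


def kv_cache_bytes(model: HypotheticalModel, context_tokens: int) -> int:
    """Size of keys plus values (factor of 2) across all layers and KV heads."""
    per_token = (2 * model.layers * model.kv_heads
                 * model.head_dim * model.bytes_per_value)
    return per_token * context_tokens


def sweep_time_ms(num_bytes: float) -> float:
    """Lower bound on reading num_bytes once at peak HBM bandwidth."""
    return num_bytes / HBM_BANDWIDTH_BYTES_S * 1e3


model = HypotheticalModel()
cache = kv_cache_bytes(model, 1_000_000)
print(f"KV cache: {cache / 1e9:.1f} GB of {HBM_CAPACITY_BYTES / 1e9:.0f} GB")
print(f"One full read at peak bandwidth: {sweep_time_ms(cache):.2f} ms")
```

Under these assumptions the cache for one million tokens occupies a little over half of a single GPU's HBM4. The read time is a best-case bound at peak bandwidth, so real decode latency per token would be higher.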
The Vera Rubin NVL72 platform delivers 3.6 ExaFLOPS of FP4 compute per rack (72 GPUs at 50 PFLOPS each), enabling large agentic AI systems that orchestrate hundreds of specialized models simultaneously. NVLink 6 provides 3.5 TB/s per GPU and is intended to keep inter-model communication latency below a microsecond.
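The per-rack figure follows from the per-GPU numbers quoted on this page. The short sketch below reproduces that arithmetic, assuming that NVL72 denotes 72 Rubin GPUs per rack and that the 50 PFLOPS FP4 figure is a per-GPU peak. The function names are illustrative only.

```python
GPUS_PER_RACK = 72          # assumption: NVL72 denotes 72 Rubin GPUs per rack
FP4_PFLOPS_PER_GPU = 50     # peak FP4 figure quoted on this page
HBM_GB_PER_GPU = 288        # from the spec table


def rack_fp4_exaflops(gpus: int = GPUS_PER_RACK,
                      pflops_per_gpu: float = FP4_PFLOPS_PER_GPU) -> float:
    """Aggregate peak FP4 compute in ExaFLOPS (1 EFLOPS = 1000 PFLOPS)."""
    return gpus * pflops_per_gpu / 1000


def rack_hbm_terabytes(gpus: int = GPUS_PER_RACK,
                       gb_per_gpu: float = HBM_GB_PER_GPU) -> float:
    """Aggregate HBM4 capacity in TB (decimal units, 1 TB = 1000 GB)."""
    return gpus * gb_per_gpu / 1000


print(f"Rack FP4 compute: {rack_fp4_exaflops():.1f} EFLOPS")
print(f"Rack HBM4 capacity: {rack_hbm_terabytes():.1f} TB")
```

These are peak aggregates: sustained throughput depends on workload, precision, and interconnect utilization.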
With HBM4's 2048-bit interface (double the width of HBM3E's) delivering 22 TB/s of bandwidth, Rubin enables scientific simulations at resolutions that were previously impractical, from full-atom protein folding to planetary-scale climate models with AI-augmented physics.
Rubin's compute and memory bandwidth power real-time world simulation: physically accurate digital twins of factories, cities, and autonomous vehicle environments that run AI perception, planning, and rendering simultaneously.
| Specification | Value |
| --- | --- |
| GPU Architecture | NVIDIA Rubin |
| Transistor Count | ~336 billion (3 nm process) |
| Die Configuration | Dual-die CoWoS-L (2× reticle limit) |
| Tensor Cores | 6th-gen Tensor Core architecture |
| Memory Capacity | 288 GB HBM4 |
| Memory Bandwidth | 22.0 TB/s |
| NVLink Bandwidth | 3.5 TB/s per GPU (NVLink 6) |
| CPU Pairing | Vera CPU (88-core Arm, 1.5 TB LPDDR5X) |
| Form Factor | Vera Rubin Superchip / NVL72 |
| Thermal Design Power | 2,300 W |
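One way to read the compute and bandwidth figures together is a simple roofline estimate. The sketch below is a back-of-the-envelope model that uses only the peak figures quoted on this page (50 PFLOPS FP4, 22 TB/s). The sample arithmetic intensities are hypothetical values picked for illustration, not measurements of any kernel.

```python
PEAK_FP4_FLOPS = 50e15     # 50 PFLOPS FP4 peak, as quoted on this page
HBM_BANDWIDTH = 22e12      # 22 TB/s, from the spec table


def ridge_point() -> float:
    """Arithmetic intensity (FLOP/byte) where compute and memory limits meet."""
    return PEAK_FP4_FLOPS / HBM_BANDWIDTH


def attainable_flops(intensity: float) -> float:
    """Roofline bound: the lower of peak compute and bandwidth * intensity."""
    return min(PEAK_FP4_FLOPS, intensity * HBM_BANDWIDTH)


def limiting_resource(intensity: float) -> str:
    """Classify a kernel by which side of the ridge point it falls on."""
    return "memory-bound" if intensity < ridge_point() else "compute-bound"


print(f"Ridge point: {ridge_point():.0f} FLOP/byte")
for intensity in (4, 500, 5000):   # hypothetical kernel intensities
    pflops = attainable_flops(intensity) / 1e15
    print(f"{intensity:>5} FLOP/byte -> {pflops:.3f} PFLOPS "
          f"({limiting_resource(intensity)})")
```

In this model, a kernel needs roughly 2,270 FLOP per byte of HBM traffic before the peak FP4 rate becomes the limit. Below that, memory bandwidth sets the ceiling.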