Name: AMD Instinct MI325X
Brand: AMD

Compute Performance (FP8)

1.3x vs H100 FP8

2.6 PetaFLOPS

Peak FP8 matrix performance, paired with 256GB of HBM3e on a memory-optimized CDNA 3 GPU

MI325X (CDNA 3): 2.6 PF

H100 (Hopper): 2.0 PF

MI300X (CDNA 3): 2.6 PF
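
The "1.3x vs H100" headline can be reproduced from the two peak figures quoted above. The following is a minimal sketch; the dictionary and the `speedup` helper are illustrative names, and the numbers are peak values from this page rather than measured throughput.

```python
# Peak FP8 figures as quoted in the comparison above, in PetaFLOPS.
PEAK_FP8_PF = {"MI325X": 2.6, "H100": 2.0}


def speedup(target: str, baseline: str) -> float:
    """Ratio of peak FP8 throughput, target over baseline."""
    return PEAK_FP8_PF[target] / PEAK_FP8_PF[baseline]


print(f"MI325X vs H100: {speedup('MI325X', 'H100'):.1f}x")
```

Peak ratios like this are an upper bound on what a workload might see; sustained performance depends on kernels, precision handling, and memory behavior.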

Memory System

256GB HBM3e

8 HBM3e stacks / 8192-bit interface

6.0 TB/s Bandwidth
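
The three memory figures above can be related with simple arithmetic. This sketch assumes the "8" counts HBM stacks and that capacity and bus width are split evenly across them; the function names are illustrative, and the per-pin rate is only what the quoted totals imply, not a published specification.

```python
# Figures quoted in the Memory System block above.
CAPACITY_GB = 256
STACKS = 8                # assumption: the "8" refers to HBM stacks
BUS_WIDTH_BITS = 8192
BANDWIDTH_TB_S = 6.0      # decimal terabytes per second


def per_stack_capacity_gb() -> float:
    """Capacity per stack if the total is split evenly."""
    return CAPACITY_GB / STACKS


def per_stack_bus_bits() -> int:
    """Interface width per stack if the bus is split evenly."""
    return BUS_WIDTH_BITS // STACKS


def implied_pin_rate_gbps() -> float:
    """Per-pin data rate implied by total bandwidth and bus width."""
    bytes_per_transfer = BUS_WIDTH_BITS / 8
    transfers_per_second = BANDWIDTH_TB_S * 1e12 / bytes_per_transfer
    return transfers_per_second / 1e9


print(per_stack_capacity_gb(), "GB per stack")
print(per_stack_bus_bits(), "bits per stack")
print(f"{implied_pin_rate_gbps():.2f} Gb/s per pin (implied)")
```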

Interconnect & I/O

896 GB/s Infinity Fabric

7 xGMI links, bi-directional

PCIe Gen 5.0 x16
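
To put the interconnect figures in per-link terms, the aggregate can be divided across the seven xGMI links and compared with the PCIe host link. This is a rough sketch: it assumes the 896 GB/s figure is the bi-directional aggregate over all seven links, and the PCIe estimate uses the standard Gen 5 signaling rate (32 GT/s per lane) and 128b/130b encoding, ignoring protocol overhead.

```python
INFINITY_FABRIC_TOTAL_GB_S = 896   # aggregate figure quoted above
XGMI_LINKS = 7


def per_link_gb_s() -> float:
    """Bandwidth per xGMI link if the aggregate is split evenly."""
    return INFINITY_FABRIC_TOTAL_GB_S / XGMI_LINKS


def pcie_gen5_gb_s_per_direction(lanes: int = 16) -> float:
    """Approximate raw PCIe Gen 5 bandwidth in one direction."""
    gigatransfers_per_lane = 32.0       # GT/s per lane for Gen 5
    encoding_efficiency = 128 / 130     # 128b/130b line encoding
    return gigatransfers_per_lane * lanes * encoding_efficiency / 8


print(f"{per_link_gb_s():.0f} GB/s per xGMI link")
print(f"{pcie_gen5_gb_s_per_direction():.1f} GB/s per direction over PCIe Gen 5 x16")
```

The gap between the two numbers is why multi-GPU traffic is usually kept on the GPU-to-GPU fabric rather than routed through the host.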

Real-World Applications

Memory-Optimized LLM Serving

The MI325X's 256GB of HBM3e, 33% more than the MI300X's 192GB, leaves room for larger KV caches and batch sizes when serving 70B+ parameter models. Memory bandwidth rises to 6 TB/s, and the part is designed as a drop-in upgrade for existing MI300X infrastructure.
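
The effect of extra capacity on KV-cache headroom can be estimated with a back-of-the-envelope calculation. The sketch below is illustrative only: the `ModelShape` values describe a hypothetical 70B model with grouped-query attention (80 layers, 8 KV heads, head dimension 128, 16-bit values), and the estimate ignores activations, fragmentation, and runtime overhead.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelShape:
    params: float          # parameter count
    layers: int            # transformer layers
    kv_heads: int          # key/value heads (grouped-query attention)
    head_dim: int          # dimension per head
    bytes_per_value: int   # 2 for FP16/BF16


def kv_bytes_per_token(m: ModelShape) -> int:
    """Bytes of KV cache per token: one key and one value per layer and head."""
    return 2 * m.layers * m.kv_heads * m.head_dim * m.bytes_per_value


def max_cached_tokens(m: ModelShape, hbm_gb: float) -> int:
    """Tokens of KV cache that fit after the weights are loaded."""
    weight_bytes = m.params * m.bytes_per_value
    free_bytes = hbm_gb * 1e9 - weight_bytes
    return max(0, int(free_bytes // kv_bytes_per_token(m)))


# Hypothetical 70B shape; these values are assumptions, not from this page.
llm_70b = ModelShape(params=70e9, layers=80, kv_heads=8, head_dim=128, bytes_per_value=2)

for name, capacity_gb in [("MI300X", 192), ("MI325X", 256)]:
    print(name, max_cached_tokens(llm_70b, capacity_gb), "tokens of KV cache")
```

Under these assumptions the 16-bit weights alone take 140GB, so the added 64GB goes almost entirely to cache, which is why the relative gain in cached tokens is larger than the 33% gain in raw capacity.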

Cost-Effective AI Inference

Positioned as a direct competitor to the NVIDIA H200, the MI325X pairs comparable FP8 performance with about 82% more memory capacity (256GB vs. 141GB). For inference-heavy deployments, the extra capacity can mean more concurrent users per GPU and a lower total cost.
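
The capacity comparisons quoted in this section can be checked with one small helper. In this sketch the MI300X (192GB) and H200 (141GB) capacities come from the vendors' public spec sheets rather than from this page, and `percent_more` is an illustrative name.

```python
def percent_more(a: float, b: float) -> float:
    """How much larger a is than b, as a percentage of b."""
    return (a / b - 1) * 100


MI325X_GB = 256
MI300X_GB = 192   # assumption: public MI300X spec
H200_GB = 141     # assumption: public H200 spec

print(f"vs MI300X: {percent_more(MI325X_GB, MI300X_GB):.1f}% more memory")
print(f"vs H200:   {percent_more(MI325X_GB, H200_GB):.1f}% more memory")
```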

Scientific HPC & Simulation

With the same 304 compute units as the MI300X and 6 TB/s of memory bandwidth, the MI325X targets memory-bound HPC workloads. The additional capacity lets larger simulation domains fit in a single GPU's memory.
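
As a rough illustration of what "larger simulation domains" could mean, the sketch below finds the largest cubic grid that fits in a given memory budget. It assumes a hypothetical solver storing 5 double-precision fields per cell and ignores halo regions, temporaries, and runtime overhead, so real limits would be lower.

```python
def largest_cubic_grid(capacity_gb: float, fields: int, bytes_per_value: int = 8) -> int:
    """Largest n such that an n x n x n grid with `fields` values per cell fits."""
    budget_bytes = int(capacity_gb * 1e9)
    max_cells = budget_bytes // (fields * bytes_per_value)
    n = round(max_cells ** (1 / 3))
    # Correct any floating-point rounding in the cube root.
    while n ** 3 > max_cells:
        n -= 1
    while (n + 1) ** 3 <= max_cells:
        n += 1
    return n


# 5 fields per cell is an assumption chosen for illustration.
for name, capacity_gb in [("192 GB", 192), ("256 GB", 256)]:
    n = largest_cubic_grid(capacity_gb, fields=5)
    print(f"{name}: up to {n} cells per side")
```

Because capacity scales with the cube of the side length, a 33% increase in memory lengthens each side of the domain by only about 10%.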

Open-Source AI Ecosystem

Fully compatible with the ROCm software stack and existing MI300X infrastructure, the MI325X is a drop-in upgrade for organizations invested in the AMD open-source AI ecosystem. It adds memory headroom for training Llama, Mistral, and other open models.
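
One practical consequence of ROCm compatibility is that PyTorch's ROCm builds expose AMD GPUs through the familiar `torch.cuda` namespace, so existing device-selection code tends to carry over. The helper below is a minimal sketch for inspecting an environment; `describe_accelerator` is an illustrative name, and the function degrades gracefully if PyTorch or a GPU is absent.

```python
def describe_accelerator() -> dict:
    """Report whether PyTorch sees a GPU and whether this is a ROCm build."""
    try:
        import torch
    except ImportError:
        return {"torch_installed": False}

    info = {
        "torch_installed": True,
        # torch.version.hip is set on ROCm builds and None on other builds.
        "rocm_build": getattr(torch.version, "hip", None) is not None,
        "gpu_available": torch.cuda.is_available(),
    }
    if info["gpu_available"]:
        props = torch.cuda.get_device_properties(0)
        info["name"] = props.name
        info["memory_gb"] = props.total_memory / 1e9
    return info


print(describe_accelerator())
```

On a correctly configured MI325X host one would expect `rocm_build` to be true and the reported memory to be close to 256GB, though the exact device name string depends on the driver and PyTorch version.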

Full Technical Specifications

GPU Architecture	AMD CDNA 3
Process Node	TSMC 5nm / 6nm (3D Chiplet)
Compute Units	304
Stream Processors	19,456
Matrix Cores	1,216 (AI Accelerators)
Memory Capacity	256 GB HBM3e
Memory Interface	8192-bit
Memory Bandwidth	6.0 TB/s
Infinity Cache	256 MB
Form Factor	OAM (OCP Accelerator Module)
Thermal Design Power	1000W
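
The table's compute figures are internally consistent, which a short check makes visible. This sketch derives per-compute-unit ratios and a peak efficiency figure from values on this page; the `SPEC` dictionary and function names are illustrative, and the efficiency number divides peak FP8 by TDP, so it is a theoretical ceiling rather than a measurement.

```python
# Values copied from the specification table and the FP8 headline above.
SPEC = {
    "compute_units": 304,
    "stream_processors": 19_456,
    "matrix_cores": 1_216,
    "tdp_watts": 1000,
    "peak_fp8_pflops": 2.6,
}


def per_compute_unit(key: str) -> float:
    """Count of a resource per compute unit."""
    return SPEC[key] / SPEC["compute_units"]


def peak_fp8_tflops_per_watt() -> float:
    """Peak FP8 throughput divided by thermal design power."""
    return SPEC["peak_fp8_pflops"] * 1000 / SPEC["tdp_watts"]


print(per_compute_unit("stream_processors"), "stream processors per CU")
print(per_compute_unit("matrix_cores"), "matrix cores per CU")
print(f"{peak_fp8_tflops_per_watt():.1f} peak FP8 TFLOPS per watt")
```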