AMD Instinct MI325X

Arch: CDNA 3 | Production Ready | TDP 1000W
Compute Performance (FP8): ▲ 1.3x vs H100 FP8
2.6 PetaFLOPS
Peak FP8 matrix performance, paired with 256 GB of HBM3e: the memory-optimized CDNA 3 accelerator
MI325X (CDNA 3): 2.6 PF
H100 (Hopper): 2.0 PF
MI300X (CDNA 3): 2.6 PF
Memory System
256 GB HBM3e
8 HBM3e stacks / 8192-bit interface
6.0 TB/s Bandwidth
Interconnect & I/O
896 GB/s Infinity Fabric
7 xGMI links, bi-directional
PCIe Gen 5.0 x16
Real-World Applications
Memory-Optimized LLM Serving

The MI325X carries 256 GB of HBM3e, 33% more than the MI300X's 192 GB, which leaves room to serve 70B+ parameter models with larger KV caches and batch sizes. The memory delivers 6 TB/s of bandwidth, and because the part shares the CDNA 3 architecture with the MI300X, it is positioned as a drop-in upgrade for existing MI300X infrastructure.
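To make the capacity argument concrete, here is a rough sizing sketch for how many KV-cache tokens fit next to the model weights. The model shape (80 layers, 8 KV heads, head dimension 128), the 16-bit weight and cache formats, and the 10% reserve for activations and runtime overhead are all assumptions chosen for illustration; none of them come from this page, and real serving stacks will differ.

```python
from dataclasses import dataclass

GB = 1e9  # decimal gigabytes, matching the 256 GB capacity figure


@dataclass(frozen=True)
class ModelShape:
    """Hypothetical 70B-class decoder shape (assumed, not from the spec sheet)."""
    params: float = 70e9
    layers: int = 80
    kv_heads: int = 8      # assumes grouped-query attention
    head_dim: int = 128


def weight_bytes(shape: ModelShape, bytes_per_param: int) -> float:
    """Memory needed to hold the weights at the given precision."""
    return shape.params * bytes_per_param


def kv_bytes_per_token(shape: ModelShape, bytes_per_elem: int) -> int:
    """KV-cache bytes for one token; the factor 2 covers keys and values."""
    return 2 * shape.layers * shape.kv_heads * shape.head_dim * bytes_per_elem


def kv_token_budget(hbm_gb: float, shape: ModelShape,
                    weight_dtype_bytes: int = 2, kv_dtype_bytes: int = 2,
                    reserve_frac: float = 0.10) -> int:
    """Tokens of KV cache that fit after weights and a reserve are set aside."""
    usable = hbm_gb * GB * (1 - reserve_frac)
    free = usable - weight_bytes(shape, weight_dtype_bytes)
    if free <= 0:
        return 0
    return int(free // kv_bytes_per_token(shape, kv_dtype_bytes))


if __name__ == "__main__":
    shape = ModelShape()
    for name, capacity_gb in (("MI325X", 256), ("MI300X", 192)):
        print(name, kv_token_budget(capacity_gb, shape), "KV-cache tokens")
```

Under these assumptions the 16-bit weights take 140 GB on either part, so the extra 64 GB goes almost entirely to KV cache. That is why the token budget grows by much more than the 33% capacity increase.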

Cost-Effective AI Inference

Positioned as a direct competitor to the NVIDIA H200, the MI325X delivers comparable FP8 performance with about 80% more memory capacity (256 GB versus 141 GB). For inference-heavy deployments, the extra capacity can translate into more concurrent users per GPU and a lower total cost.
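The capacity comparisons on this page reduce to simple ratios. The sketch below recomputes them; the 192 GB (MI300X) and 141 GB (H200) capacities are the vendors' published figures rather than values stated on this page, and the helper name is illustrative.

```python
# HBM capacity in GB. Only the MI325X figure is stated on this page;
# the other two are the published capacities of the comparison parts.
HBM_GB = {"MI325X": 256, "MI300X": 192, "H200": 141}


def percent_more(a_gb: float, b_gb: float) -> float:
    """How much larger a_gb is than b_gb, in percent."""
    return (a_gb / b_gb - 1.0) * 100.0


if __name__ == "__main__":
    for other in ("MI300X", "H200"):
        gain = percent_more(HBM_GB["MI325X"], HBM_GB[other])
        print(f"MI325X vs {other}: {gain:.1f}% more HBM")
```

Capacity alone does not fix the number of concurrent users. That also depends on model size, context length, and the serving software, so the ratio is best read as an upper bound on the headroom gained.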

Scientific HPC & Simulation

With the same 304 compute units as the MI300X and 6 TB/s of memory bandwidth, the MI325X targets memory-bound HPC workloads. The additional capacity lets larger simulation domains stay resident on a single accelerator before they run out of memory.
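Whether a kernel is memory-bound can be estimated with a simple roofline model, sketched below. The defaults use the two peak figures quoted on this page (2.6 PFLOPS FP8 and 6 TB/s). HPC codes usually run in FP64 or FP32, whose peak rates are not listed here, so treat `peak_flops` as a parameter to replace. The triad-style kernel is an assumed example, and peak numbers are theoretical limits, not measured throughput.

```python
PEAK_FLOPS = 2.6e15    # FP8 peak from this page; swap in the precision you use
PEAK_BYTES_S = 6.0e12  # HBM3e bandwidth from this page


def ridge_point(peak_flops: float = PEAK_FLOPS,
                peak_bw: float = PEAK_BYTES_S) -> float:
    """Arithmetic intensity (FLOP/byte) above which a kernel is compute-bound."""
    return peak_flops / peak_bw


def attainable_flops(intensity: float,
                     peak_flops: float = PEAK_FLOPS,
                     peak_bw: float = PEAK_BYTES_S) -> float:
    """Roofline bound: the lower of the compute roof and the bandwidth roof."""
    return min(peak_flops, peak_bw * intensity)


def min_stream_seconds(num_bytes: float, peak_bw: float = PEAK_BYTES_S) -> float:
    """Lower bound on the time to read or write num_bytes once."""
    return num_bytes / peak_bw


if __name__ == "__main__":
    # Assumed example: a triad-like update doing 2 FLOPs per 24 bytes moved.
    triad_intensity = 2 / 24
    print("ridge point (FLOP/byte):", ridge_point())
    print("triad bound (FLOP/s):", attainable_flops(triad_intensity))
    print("one pass over 256 GB (s):", min_stream_seconds(256e9))
```

Kernels far below the ridge point are limited by the 6 TB/s figure rather than by compute, which is the regime where this part's memory system matters most.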

Open-Source AI Ecosystem

The MI325X is compatible with the ROCm software stack and with existing MI300X infrastructure, so organizations already invested in AMD's open-source AI ecosystem can adopt it as a drop-in upgrade with more memory for training Llama, Mistral, and other open models.
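One way to confirm that a framework sees the accelerator through ROCm is to query PyTorch, which exposes ROCm devices through its `torch.cuda` namespace. The sketch below assumes a ROCm build of PyTorch is installed. It returns `None` when PyTorch is missing, is not a ROCm build, or finds no device. The function name is illustrative, and the device name string reported by the driver may differ from the marketing name.

```python
def describe_rocm_device(index: int = 0):
    """Return basic facts about a ROCm-visible GPU, or None if unavailable."""
    try:
        import torch  # assumed: a ROCm build of PyTorch
    except ImportError:
        return None

    # torch.version.hip is a version string on ROCm builds and None otherwise.
    hip_version = getattr(torch.version, "hip", None)
    if hip_version is None or not torch.cuda.is_available():
        return None
    if index >= torch.cuda.device_count():
        return None

    props = torch.cuda.get_device_properties(index)
    return {
        "name": props.name,
        "hip_version": hip_version,
        "total_memory_gb": props.total_memory / 1e9,
    }


if __name__ == "__main__":
    info = describe_rocm_device()
    print(info if info is not None else "No ROCm device visible to PyTorch")
```

On a correctly configured system the reported memory should be close to the 256 GB capacity, minus whatever the driver and runtime reserve.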

Full Technical Specifications
GPU Architecture: AMD CDNA 3
Process Node: TSMC 5nm / 6nm (3D Chiplet)
Compute Units: 304
Stream Processors: 19,456
Matrix Cores: 1,216 (AI Accelerators)
Memory Capacity: 256 GB HBM3e
Memory Interface: 8192-bit
Memory Bandwidth: 6.0 TB/s
Infinity Cache: 256 MB
Form Factor: OAM (OCP Accelerator Module)
Thermal Design Power: 1000W
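Several of the figures on this page can be cross-checked against each other. The sketch below derives per-unit ratios from the listed totals, using only numbers that appear above; the dictionary keys and function name are illustrative. The per-pin data rate is an implied value, assuming the 6.0 TB/s is spread evenly across the 8192-bit interface.

```python
SPEC = {
    "compute_units": 304,
    "stream_processors": 19_456,
    "matrix_cores": 1_216,
    "memory_gb": 256,
    "memory_bus_bits": 8192,
    "memory_bw_tb_s": 6.0,
    "hbm_stacks": 8,
    "fabric_gb_s": 896,
    "xgmi_links": 7,
}


def derived(spec: dict) -> dict:
    """Per-unit ratios implied by the headline totals."""
    bits_per_second = spec["memory_bw_tb_s"] * 1e12 * 8
    return {
        "stream_processors_per_cu": spec["stream_processors"] / spec["compute_units"],
        "matrix_cores_per_cu": spec["matrix_cores"] / spec["compute_units"],
        "gb_per_stack": spec["memory_gb"] / spec["hbm_stacks"],
        "bus_bits_per_stack": spec["memory_bus_bits"] / spec["hbm_stacks"],
        "gb_s_per_xgmi_link": spec["fabric_gb_s"] / spec["xgmi_links"],
        # Implied per-pin rate in Gbit/s, assuming an even spread across the bus.
        "pin_rate_gbit_s": bits_per_second / spec["memory_bus_bits"] / 1e9,
    }


if __name__ == "__main__":
    for key, value in derived(SPEC).items():
        print(f"{key}: {value:g}")
```

The totals are internally consistent: each compute unit maps to 64 stream processors and 4 matrix cores, each of the 7 links carries 128 GB/s, and each of the 8 stacks contributes 32 GB over a 1024-bit slice of the interface.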