Zen 5 | |
Produced-Start: | Mobile Desktop Server |
Designfirm: | AMD |
Manuf1: | TSMC |
Cpuid: | Family 1Ah |
Numcores: | Mobile: 8 to 12 Desktop: 6 to 16 Server: 16 to 192 |
L1cache: | 80KB (per core) |
L2cache: | 1MB (per core) |
Size-From: | TSMC N4X (Zen 5 CCD) TSMC N3E (Zen 5c CCD) TSMC N6 TSMC N4P (Mobile) |
Arch: | AMD64 (x86-64) |
Sock1: | Desktop |
Sock2: | Server |
Sock4: | Mobile |
Memory1: | DDR5 |
Extensions: | Crypto AES, SHA |
Extensions1: | SIMD MMX-plus, SSE, SSE2, SSE3, SSE4.1, SSE4.2, SSE4A, SSSE3, FMA3, AVX, AVX2, AVX512 |
Extensions2: | Virtualization AMD-V |
Pcode1: | Desktop |
Pcode2: | Thin & Light Mobile |
Pcode5: | Server |
Brand1: | Ryzen |
Brand2: | Ryzen AI |
Brand3: | Epyc |
Predecessor: | Zen 4 |
Successor: | Zen 6 |
Zen 5 is the name for a CPU microarchitecture by AMD, shown on their roadmap in May 2022,[1] launched for mobile in July 2024 and for desktop in August 2024.[2] It is the successor to Zen 4 and is currently fabricated on TSMC's N4X process.[3] Zen 5 is also planned to be fabricated on the N3E process in the future.[4]
The Zen 5 microarchitecture powers Ryzen 9000 series desktop processors (codenamed "Granite Ridge"), Epyc 9005 server processors (codenamed "Turin"),[5] and Ryzen AI 300 thin and light mobile processors (codenamed "Strix Point").[6]
Zen 5 was first officially mentioned during AMD's Ryzen Processors: One Year Later presentation on April 9, 2018.[7]
A roadmap shown during AMD's Financial Analyst Day on June 9, 2022 confirmed that Zen 5 and Zen 5c would be launching in 3nm and 4nm variants in 2024.[8] The earliest details on the Zen 5 architecture promised a "re-pipelined front end and wide issue" with "integrated AI and Machine Learning optimizations".
During AMD's Q4 2023 earnings call on January 30, 2024, AMD CEO Lisa Su stated that Zen 5 products would be "coming in the second half of the year".[9]
Zen 5 is a ground-up redesign of Zen 4 with a wider front-end, increased floating point throughput and more accurate branch prediction.[10]
Zen 5 was designed with both 4nm and 3nm processes in mind. This acted as an insurance policy for AMD in the event that TSMC's mass production of its N3 nodes were to face delays, significant wafer defect issues or capacity issues. One industry analyst estimated early N3 wafer yields to be at 55% while others estimated yields to be similar to those of N5 at between 60-80%.[11] [12] Additionally, Apple, as TSMC's largest customer, is given priority access to the latest process nodes. In 2022, Apple was responsible for 23% of TSMC's $72 billion in total revenue.[13] After N3 began ramping at the end of 2022, Apple bought up the entirety of TSMC's early N3B wafer production capacity to fabricate their A17 and M3 SoCs.[14] Zen 5 desktop and server processors continue to use the N6 node for the I/O die fabrication.[15]
Zen 5 CCDs are fabricated on TSMC's N4X node, which is intended to accommodate higher frequencies for high-performance computing (HPC) applications.[16] Zen 5-based mobile processors are instead fabricated on the N4P node, which is targeted more towards power efficiency. N4X maintains IP compatibility with N4P and offers a 6% frequency gain over N4P at the same power but comes with the trade-off of moderate leakage.[17] Compared to the N5 node used to produce Zen 4 CCDs, N4X can enable up to 15% higher frequencies while running at 1.2V.[18]
The Zen 5 CCD, codenamed "Eldora", has a die size of 70.6 mm², a 0.5% reduction in area from Zen 4's 71 mm² CCD, while achieving a 28% increase in transistor density due to the N4X process node.[19] Zen 5's CCD contains 8.315 billion transistors compared to the Zen 4 CCD's 6.5 billion transistors.[20] An individual Zen 5 core is actually larger than a Zen 4 core, but the overall CCD area has been reduced by shrinking the area occupied by the L3 cache. The monolithic die used by "Strix Point" mobile processors, fabricated on TSMC's lower-power N4P node, measures 232.5 mm² in area.[19]
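The area and density figures above can be cross-checked with a few lines of arithmetic. The following Python sketch uses only the die sizes and transistor counts quoted in this section; the helper names are illustrative.

```python
# Figures quoted in the text: CCD area in mm^2 and transistor count.
ZEN4_CCD = {"area_mm2": 71.0, "transistors": 6.5e9}
ZEN5_CCD = {"area_mm2": 70.6, "transistors": 8.315e9}


def density_mtr_per_mm2(ccd):
    """Transistor density in millions of transistors per mm^2."""
    return ccd["transistors"] / 1e6 / ccd["area_mm2"]


def percent_change(old, new):
    """Relative change from old to new, in percent."""
    return (new - old) / old * 100.0


area_change = percent_change(ZEN4_CCD["area_mm2"], ZEN5_CCD["area_mm2"])
density_change = percent_change(
    density_mtr_per_mm2(ZEN4_CCD), density_mtr_per_mm2(ZEN5_CCD)
)

print(f"Zen 4 density: {density_mtr_per_mm2(ZEN4_CCD):.1f} MTr/mm^2")
print(f"Zen 5 density: {density_mtr_per_mm2(ZEN5_CCD):.1f} MTr/mm^2")
print(f"Area change: {area_change:.2f}%  Density change: {density_change:.1f}%")
```

The results (an area change of roughly -0.56% and a density gain of roughly 28.6%) are in line with the rounded figures given above.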
Zen 5's changes to branch prediction are the most significant divergence from any previous Zen microarchitecture. The branch predictor in a core tries to predict the outcome when there are diverging code paths. Zen 5's branch predictor is able to operate two-ahead: it can try to predict two code paths ahead before they are executed, rather than predicting one code path, waiting for it to be executed, then predicting the next one.[21] Two-ahead branch predictors have been discussed in academic research dating back to André Seznec et al.'s 1996 paper "Multiple-block ahead branch predictors".[22] 28 years after it was first proposed in academic research, AMD's Zen 5 architecture became the first microarchitecture to fully implement two-ahead branch prediction. Increased data prefetching assists the branch predictor.
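The general idea of predicting two blocks per lookup can be illustrated with a toy model. The Python sketch below is not a description of AMD's hardware: the lookup tables, the block trace and the "one lookup per step" cost model are all hypothetical, chosen only to show how a two-ahead scheme produces the same predicted path with fewer predictor lookups.

```python
def train(trace):
    """Learn, for each block, the block that followed it and the one after that."""
    next1, next2 = {}, {}
    for i, block in enumerate(trace):
        if i + 1 < len(trace):
            next1[block] = trace[i + 1]
        if i + 2 < len(trace):
            next2[block] = trace[i + 2]
    return next1, next2


def predict_path(start, n_blocks, next1, next2=None):
    """Return (predicted blocks, number of predictor lookups).

    With next2=None, each lookup yields one block (one-ahead).
    With next2 given, each lookup on the current block yields the
    next two blocks at once (two-ahead).
    """
    path, lookups, current = [], 0, start
    while len(path) < n_blocks:
        lookups += 1
        path.append(next1[current])
        if next2 is not None and len(path) < n_blocks:
            path.append(next2[current])
        current = path[-1]
    return path, lookups


# Hypothetical trace: a loop over four code blocks A -> B -> C -> D -> A ...
trace = list("ABCD") * 4
next1, next2 = train(trace)

one_ahead_path, one_ahead_lookups = predict_path("A", 8, next1)
two_ahead_path, two_ahead_lookups = predict_path("A", 8, next1, next2)

print(one_ahead_path, one_ahead_lookups)
print(two_ahead_path, two_ahead_lookups)
```

In this simplified model both predictors produce the same eight-block path, but the two-ahead variant needs half as many lookups. A real predictor must also handle conditional outcomes and mispredictions, which this sketch ignores.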
Zen 5 contains 6 Arithmetic Logic Units (ALUs), up from 4 ALUs in prior Zen architectures. Because the ALUs handle common integer operations, going from 4 to 6 of them can increase peak per-cycle scalar integer throughput by up to 50%.[23]
The vector engine in Zen 5 features 4 floating point pipes compared to 3 pipes in Zen 4. Zen 4 introduced AVX-512 instructions. AVX-512 capabilities have been expanded with Zen 5 with a doubling of the floating point pipe width to a native 512-bit floating point datapath. The AVX-512 datapath is configurable depending on the product. Ryzen 9000 series desktop processors and Epyc 9005 server processors feature the full 512-bit datapath, but Ryzen AI 300 mobile processors feature a 256-bit datapath in order to reduce power consumption. AVX-512 instruction support has been extended with VNNI/VEX instructions. Additionally, there is greater bfloat16 throughput, which is beneficial for AI workloads.
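The effect of datapath width can be sketched with simple lane arithmetic. The Python snippet below assumes, as a simplification, that a 512-bit operation on a narrower datapath is split into equal-width passes; the function names are illustrative and the model ignores scheduling details.

```python
import math


def lanes(vector_bits, element_bits):
    """Number of elements that fit in one vector register."""
    return vector_bits // element_bits


def passes(op_bits, datapath_bits):
    """Passes needed to push one operation through the datapath
    (assumption: the operation is split into datapath-sized pieces)."""
    return math.ceil(op_bits / datapath_bits)


fp32_lanes = lanes(512, 32)      # single-precision elements per 512-bit vector
bf16_lanes = lanes(512, 16)      # bfloat16 elements per 512-bit vector
full_width = passes(512, 512)    # 512-bit datapath (desktop, server)
half_width = passes(512, 256)    # 256-bit datapath (mobile)

print(fp32_lanes, bf16_lanes, full_width, half_width)
```

Under this model a 512-bit vector holds 16 single-precision or 32 bfloat16 elements, and the 256-bit configuration needs two passes where the 512-bit configuration needs one.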
The wider front end in the Zen 5 architecture necessitates larger caches and higher memory bandwidth in order to keep the cores fed with data. The total L1 cache per core is increased from 64 KB to 80 KB. The L1 instruction cache remains the same at 32 KB, but the L1 data cache is increased from 32 KB to 48 KB per core. Furthermore, the bandwidth of the L1 data cache for 512-bit floating point unit pipes has also been doubled. The L1 data cache's associativity has increased from 8-way to 12-way in order to accommodate its larger size.
The L2 cache remains at 1 MB but its associativity has increased from 8-way to 16-way. Zen 5 also has a doubled L2 cache bandwidth of 64 bytes per clock.
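The relationship between cache size and associativity can be made concrete by computing the number of sets. The Python sketch below assumes a 64-byte cache line, which is not stated in this section; the sizes and way counts are the ones given in the text.

```python
KB = 1024
MB = 1024 * KB
LINE_BYTES = 64  # assumption: 64-byte cache lines


def num_sets(size_bytes, ways, line_bytes=LINE_BYTES):
    """Sets in a set-associative cache: size / (ways * line size)."""
    return size_bytes // (ways * line_bytes)


l1d_sets = {
    "Zen 4": num_sets(32 * KB, 8),
    "Zen 5": num_sets(48 * KB, 12),
}
l2_sets = {
    "Zen 4": num_sets(1 * MB, 8),
    "Zen 5": num_sets(1 * MB, 16),
}

print(l1d_sets)
print(l2_sets)
```

With these assumptions the L1 data cache keeps the same number of sets in both generations (the extra capacity comes entirely from the added ways), while the L2 cache halves its set count because the way count doubles at a fixed size.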
The L3 cache is filled from L2 cache victims and in-flight misses. Latency for accessing the L3 cache has been reduced by 3.5 cycles.[24] A Zen 5 Core Complex Die (CCD) contains 32 MB of L3 cache shared between the 8 cores. In Zen 5 3D V-Cache CCDs, a piece of silicon containing 64 MB of extra L3 cache is placed under the cores, rather than on top as in prior generations, for a total of 96 MB.[25] This allows for higher core frequencies than previous-generation 3D V-Cache implementations, which were sensitive to higher voltages. The Zen 5-based Ryzen 7 9800X3D has a base frequency 500 MHz higher than the Zen 4-based Ryzen 7 7800X3D and allows overclocking for the first time.[26]
Ryzen AI 300 APUs, codenamed "Strix Point", feature 24 MB of total L3 cache, which is split into two separate cache arrays. 16 MB of dedicated L3 cache is shared by the 4 Zen 5 cores and 8 MB is shared by the 8 Zen 5c cores.[27] Zen 5c cores are not able to access the 16 MB L3 cache array and vice versa.[28]
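The split L3 arrangement can be summarised as a small data structure. The Python sketch below uses the capacities and core counts given above; the dictionary layout is purely illustrative.

```python
# L3 arrays on "Strix Point": each array is private to one core type.
strix_point_l3 = {
    "Zen 5": {"l3_mb": 16, "cores": 4},
    "Zen 5c": {"l3_mb": 8, "cores": 8},
}

total_l3_mb = sum(array["l3_mb"] for array in strix_point_l3.values())
l3_per_core_mb = {
    core_type: array["l3_mb"] / array["cores"]
    for core_type, array in strix_point_l3.items()
}

# Desktop/server CCD with 3D V-Cache: base L3 plus the stacked cache die.
vcache_ccd_l3_mb = 32 + 64

print(total_l3_mb, l3_per_core_mb, vcache_ccd_l3_mb)
```

On average each Zen 5 core therefore has four times as much L3 capacity available as each Zen 5c core, although any single core can use its whole array.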
Cache | Property | Zen 4 | Zen 5 |
---|---|---|---|
L1 Data | Size | 32 KB | 48 KB |
L1 Data | Associativity | 8-way | 12-way |
L1 Data | Bandwidth | 32B/clk | 64B/clk |
L1 Instructions | Size | 32 KB | 32 KB |
L1 Instructions | Associativity | 8-way | 8-way |
L1 Instructions | Bandwidth | 64B/clk | 64B/clk |
L2 | Size | 1 MB | 1 MB |
L2 | Associativity | 8-way | 16-way |
L2 | Bandwidth | 32B/clk | 64B/clk |
L3 | Size | 32 MB | 32 MB |
L3 | Associativity | 16-way | 16-way |
L3 | Bandwidth | 32B/clk Read, 16B/clk Write | 32B/clk Read, 16B/clk Write |
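The per-clock bandwidth figures in the table can be converted to bytes per second once a clock frequency is chosen. The Python sketch below uses a clock of 5.0 GHz purely as an illustrative assumption; it is not a specification of any particular product, and real caches may run in different clock domains.

```python
ASSUMED_CLOCK_GHZ = 5.0  # hypothetical clock, for illustration only


def peak_gb_per_s(bytes_per_clk, clock_ghz=ASSUMED_CLOCK_GHZ):
    """Peak bandwidth in GB/s: bytes per clock times 10^9 clocks per second."""
    return bytes_per_clk * clock_ghz


# Bytes per clock taken from the table above.
l1d = {"Zen 4": 32, "Zen 5": 64}
l2 = {"Zen 4": 32, "Zen 5": 64}

l1d_gbs = {gen: peak_gb_per_s(b) for gen, b in l1d.items()}
l2_gbs = {gen: peak_gb_per_s(b) for gen, b in l2.items()}

print(l1d_gbs)
print(l2_gbs)
```

At the same assumed clock, doubling the bytes per clock doubles the theoretical peak bandwidth per core; sustained bandwidth in practice depends on the workload.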
Other features and changes in the Zen 5 architecture, compared to Zen 4, include:
Feature | Zen 4 | Zen 5 |
---|---|---|
L1/L2 BTB | 1.5K/7K | 16K/8K |
Return Address Stack | 32 | 52 |
ITLB L1/L2 | 64/512 | 64/2048 |
Fetched/Decoded Instruction Bytes/cycle | 32 | 64 |
Op Cache associativity | 12-way | 16-way |
Op Cache bandwidth | 9 macro-ops | 12 inst or fused inst |
Dispatch bandwidth (macro-ops/cycle) | 6 | 8 |
AGU Scheduler | 3x24 ALU/AGU | 56 |
ALU Scheduler | 1x24 ALU | 88 |
ALU/AGU | 4/3 | 6/4 |
Int PRF (reg/flag) | 224/126 | 240/192 |
Vector Reg | 192 | 384 |
FP Pre-Sched Queue | 64 | 96 |
FP Scheduler | 2x32 | 3x38 |
FP Pipes | 3 | 4 |
Vector Width | 256b | 256b/512b |
ROB/Retire Queue | 320 | 448 |
LS Mem Pipes support Load/Store | 3/1 | 4/2 |
DTLB L1/L2 | 72/3072 | 96/4096 |
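To put the comparison above in relative terms, the growth of a few structures can be computed directly. The Python sketch below takes a handful of single-valued rows from the comparison (Zen 4 value, Zen 5 value); the selection of rows is arbitrary.

```python
# (Zen 4, Zen 5) values copied from the comparison above.
structures = {
    "Return Address Stack": (32, 52),
    "Dispatch bandwidth": (6, 8),
    "ALUs": (4, 6),
    "Vector registers": (192, 384),
    "ROB/Retire Queue": (320, 448),
}


def growth_percent(old, new):
    """Relative increase from old to new, in percent."""
    return (new - old) / old * 100.0


growth = {name: growth_percent(z4, z5) for name, (z4, z5) in structures.items()}

for name, pct in growth.items():
    print(f"{name}: +{pct:.1f}%")
```

Larger structures do not translate one-to-one into performance; these ratios only show which parts of the core grew the most.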
AMD announced an initial lineup of four models of Ryzen 9000 processors on June 3, 2024, including one Ryzen 5, one Ryzen 7 and two Ryzen 9 models. Manufactured on a 4 nm process, the processors feature between 6 and 16 cores.[31] Ryzen 9000 processors were released in August 2024.
The Ryzen AI 300 series of high-performance ultrathin notebook processors was announced on June 3, 2024. Codenamed Strix Point, these processors are named under a new model numbering system similar to Intel's Core and Core Ultra model numbering. Strix Point features a 3rd gen Ryzen AI engine based on XDNA 2, providing up to 50 TOPS of neural processing unit performance. The integrated graphics is upgraded to RDNA 3.5, and top-end models have 16 GPU CUs and 12 CPU cores, an increase from the maximum of 8 CPU cores on previous generation Ryzen ultrathin mobile processors.[32] Notebooks featuring Ryzen AI 300 series processors were released on July 17, 2024.[33]
Alongside Granite Ridge desktop and Strix Point mobile processors, the Epyc 9005 series of high-performance server processors, codenamed Turin, was also announced at Computex on June 3, 2024. It uses the same SP5 socket as the previous Epyc 9004 series processors, and will pack up to 128 cores and 256 threads on the top-end model. Turin will be built on a TSMC 4 nm process.[34]
A variant of Epyc 9005 using Zen 5c cores was also shown off at Computex. It will feature a maximum of 192 cores and 384 threads, and be manufactured on a 3 nm process.
Zen 5c is a compact variant of the Zen 5 core, primarily targeted at hyperscale cloud compute server customers.[35] It will succeed the Zen 4c core.