FugakuNEXT is the codename for the follow-on flagship supercomputer for Japan after Fugaku. It is slated to be deployed in 2029 for operations in 2030, and RIKEN is the development lead for the system. Its high-level goals include:
- 5x to 10x improvement in HPC application performance over Fugaku
- More than 50 EFLOPS for AI training (100-200 EFLOPS peak)
- 50x-100x application speedup when using AI surrogates
Satoshi Matsuoka presented the following targets for simulation workloads at Salishan 2025: [1]
- Raw hardware performance gain: 10x - 20x
- Mixed precision or emulation: 2x - 8x
- Surrogates / PINN: 10x - 25x
- Total: 200x - 1000x or more over Fugaku (“Zettascale”)
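The total appears to come from multiplying the three factors together. Here is a minimal sketch of that arithmetic, assuming the gains are independent and compose multiplicatively. That assumption is mine, and so are the `factors` dictionary and the `combined_range` helper; none of them come from the presentation.

```python
from math import prod

# Per-factor speedup ranges (low, high), as quoted in the list above.
factors = {
    "raw hardware": (10, 20),
    "mixed precision or emulation": (2, 8),
    "surrogates / PINN": (10, 25),
}

def combined_range(ranges):
    """Multiply the low ends together and the high ends together.

    Assumes the factors are independent and compose multiplicatively.
    """
    ranges = list(ranges)
    low = prod(lo for lo, _ in ranges)
    high = prod(hi for _, hi in ranges)
    return low, high

low, high = combined_range(factors.values())
print(f"combined speedup: {low}x - {high}x")  # combined speedup: 200x - 4000x
```

Under this assumption, the low end reproduces the quoted 200x. The product of the high ends is 4000x, which is consistent with the "or more" qualifier on the 1000x figure.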
The system performance requirements in the RFP are: [2]
Metric | CPU | GPU |
---|---|---|
FP64 vector | 48 PF | 3,000 PF |
FP16/BF16 matrix | 1,500 PF | 150,000 PF |
FP8 matrix | 3,000 PF | 300,000 PF |
Memory capacity | 10 PiB | 10 PiB |
Memory bandwidth | 8 PB/s | 800 PB/s |
In addition, the requirements call for:
- Over 3,400 nodes
- 2:1 structured sparsity
- Less than 40 MW [1]
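These system-level figures imply rough per-node and efficiency bounds. The sketch below derives them using only the GPU FP64 vector number from the table, the 3,400-node floor, and the 40 MW ceiling. The variable names are hypothetical, and treating the full 40 MW as available to compute is a simplifying assumption.

```python
PF = 1e15  # FLOPS in one petaFLOPS
MW = 1e6   # watts in one megawatt

# Figures quoted above; node count is a floor, power is a ceiling.
MIN_NODES = 3400
MAX_POWER_W = 40 * MW
GPU_FP64_FLOPS = 3000 * PF

# With at least 3,400 nodes, these per-node values are upper bounds.
gpu_fp64_pf_per_node = GPU_FP64_FLOPS / MIN_NODES / PF
power_kw_per_node = MAX_POWER_W / MIN_NODES / 1e3

# With at most 40 MW, this efficiency is a lower bound
# (assuming the whole power budget feeds the system).
gflops_per_watt = GPU_FP64_FLOPS / MAX_POWER_W / 1e9

print(f"GPU FP64 per node: <= {gpu_fp64_pf_per_node:.2f} PF")  # <= 0.88 PF
print(f"Power per node:    <= {power_kw_per_node:.2f} kW")     # <= 11.76 kW
print(f"FP64 efficiency:   >= {gflops_per_watt:.1f} GFLOPS/W") # >= 75.0 GFLOPS/W
```

A larger node count or a lower power draw than these limits would push the per-node figures down and the efficiency up.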
The storage subsystem will have two tiers: [2]
Tier | Architecture | Implementation | Bandwidth | IOPS | Capacity |
---|---|---|---|---|---|
First | Near-node local | Something like CHFS, BeeOND | Write memory in less than 1 minute | Open/close/stat file per process in under 1 second | 2x memory |
Second | Shared | Lustre, DAOS | 20% of first tier | 10% of first tier | 30x memory |
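Because the storage targets are all expressed relative to memory, they can be turned into absolute numbers once a memory size is chosen. The sketch below does this with a hypothetical `storage_targets` helper. The 20 PiB input (the sum of the CPU and GPU memory capacities from the performance table) is my assumption, because the source does not say which memory the ratios refer to. The per-process IOPS target is not modeled because it depends on a process count that is not given.

```python
PIB = 2**50  # bytes in one pebibyte

def storage_targets(memory_bytes, dump_seconds=60):
    """Derive tier sizes and bandwidths from the ratios in the table above.

    memory_bytes: system memory the ratios are measured against (assumed).
    dump_seconds: time allowed to write memory to the first tier.
    """
    tier1_bw = memory_bytes / dump_seconds  # minimum bytes/s to meet the target
    return {
        "tier1_capacity_bytes": 2 * memory_bytes,
        "tier1_bandwidth_Bps": tier1_bw,
        "tier2_capacity_bytes": 30 * memory_bytes,
        "tier2_bandwidth_Bps": 0.2 * tier1_bw,
    }

# Assumption: "memory" means CPU + GPU capacity, 10 PiB + 10 PiB.
t = storage_targets(20 * PIB)
print(f"Tier 1: {t['tier1_capacity_bytes'] / PIB:.0f} PiB, "
      f">= {t['tier1_bandwidth_Bps'] / 1e12:.0f} TB/s")
print(f"Tier 2: {t['tier2_capacity_bytes'] / PIB:.0f} PiB, "
      f">= {t['tier2_bandwidth_Bps'] / 1e12:.0f} TB/s")
```

If the ratios refer to only one of the two memory pools instead, passing `10 * PIB` halves every figure.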
The project timeline is as follows: [2]
The details of the project were summarized on a digital poster at SC24:
Satoshi Matsuoka has been presenting a vision for FugakuNEXT since around 2022. The vision for its CPU is: [3]
Themes that may be relevant to a processor or node include: [3][2]
- 3D stacking of memory and logic (as depicted above)
- Silicon photonics
- Large SRAMs, à la AMD 3D V-Cache
- Specialized tensor core-like data paths for scientific motifs like stencils, convolution, FFTs
- Coarse-grained reconfigurable arrays (CGRAs) instead of, or in addition to, SIMD
- Processing-in-memory (PIM)
The CGRA is called out as a “strong scaling accelerator” candidate, so perhaps the CPU socket will have tiles of general-purpose CPU cores as well as CGRA tiles.