OceanLite is an exascale supercomputer located in Wuxi, China, built on Sunway SW26010P (or “SW26010-Pro”[1]) processors.

System overview

OceanLite has 107,520,[2] 107,250,[3] or 107,136[4] compute nodes, depending on the source. Each node has a single SW26010P CPU, and the nodes are connected by a tapered fat tree interconnect.

According to the system specification, it should have .[4] It was used in a Gordon Bell Prize-winning paper[2] that achieved 1.2 EF FP32 performance.
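
As a rough cross-check on the "exascale" label, the sketch below multiplies each reported node count by each per-node FP64 peak listed under Node architecture. This is back-of-the-envelope arithmetic on the figures quoted on this page, not a number taken from the system specification; the names `NODE_COUNTS`, `NODE_FP64_TF`, and `system_peak_ef` are made up for illustration.

```python
# Node counts reported by the three sources cited on this page.
NODE_COUNTS = (107_520, 107_250, 107_136)

# Per-node FP64 peak figures in TF, as quoted under "Node architecture".
NODE_FP64_TF = (13.8, 14.03)


def system_peak_ef(nodes: int, node_tf: float) -> float:
    """Naive aggregate peak in EF: nodes x per-node TF (1 EF = 1e6 TF)."""
    return nodes * node_tf / 1e6


# Print every combination so the spread between sources is visible.
for nodes in NODE_COUNTS:
    for tf in NODE_FP64_TF:
        ef = system_peak_ef(nodes, tf)
        print(f"{nodes:>7,} nodes x {tf:5.2f} TF = {ef:.3f} EF FP64")
```

Every combination lands between roughly 1.48 and 1.51 EF FP64, so the disagreement over node counts shifts this naive estimate by only a few percent.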

It is unclear how the nodes and racks are laid out.

Node architecture

Each compute node has:

  • 1x SW26010P CPU
    • 2.25 GHz[4]
    • 6x core groups (CGs)
      • 1x memory controller each
      • 16 GB DDR4 at 51.2 GB/s
      • 1x management processing element (MPE)
      • 1x 8x8 computing processing element (CPE) cluster (64 cores), each with a 512-bit vector unit
  • 390 cores (“processing elements”)
    • 6 core groups × 64 PEs per core group = 384 cores
    • 6 core groups × 1 management PE = 6 cores
    • 384 + 6 = 390 cores total
  • 96 GB DDR4 at 307.2 GB/s
  • 13.8 TF[4] or 14.03 TF[1] FP64
  • 27.6 TF[4] or 14.03 TF[1] FP32
  • 55.30 TF FP16 and BF16[1]
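
The per-node totals in the list above follow directly from the per-core-group figures. A minimal sketch of that arithmetic is below; `CoreGroup` and `node_totals` are hypothetical names used only for this illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CoreGroup:
    """One SW26010P core group, using the figures listed above."""
    mpes: int = 1              # management processing elements
    cpes: int = 8 * 8          # 8x8 computing processing element cluster
    mem_gb: int = 16           # DDR4 capacity behind this CG's memory controller
    mem_bw_gbs: float = 51.2   # DDR4 bandwidth behind this CG's memory controller


CORE_GROUPS_PER_NODE = 6


def node_totals(cg: CoreGroup = CoreGroup(), n_cgs: int = CORE_GROUPS_PER_NODE) -> dict:
    """Sum the per-core-group figures across all core groups in one node."""
    return {
        "cores": n_cgs * (cg.mpes + cg.cpes),
        "mem_gb": n_cgs * cg.mem_gb,
        # Round to one decimal place to hide floating-point noise.
        "mem_bw_gbs": round(n_cgs * cg.mem_bw_gbs, 1),
    }


print(node_totals())
```

The result agrees with the 390 cores, 96 GB, and 307.2 GB/s listed for the node.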

Network architecture

Compute nodes are grouped into supernodes of 256 nodes each, and the nodes within a supernode are interconnected as a non-blocking fat tree. The supernodes are in turn connected in a second-level fat tree with 16:3 oversubscription.[1]
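
To make these numbers concrete, the sketch below works out how many 256-node supernodes each reported node count implies, and what fraction of bandwidth is available between supernodes. It assumes "16:3 oversubscription" means 16 units of node-facing bandwidth for every 3 units of uplink bandwidth, which is the usual reading but is not spelled out in the sources above. All names in the code are illustrative.

```python
SUPERNODE_SIZE = 256

# Assumed reading of "16:3": 16 units of node-facing (downlink) bandwidth
# for every 3 units of uplink bandwidth leaving a supernode.
OVERSUB_DOWN, OVERSUB_UP = 16, 3


def split_into_supernodes(nodes: int) -> tuple[int, int]:
    """Return (full supernodes, leftover nodes) for a given node count."""
    return divmod(nodes, SUPERNODE_SIZE)


def inter_supernode_fraction() -> float:
    """Fraction of injection bandwidth available between supernodes."""
    return OVERSUB_UP / OVERSUB_DOWN


for nodes in (107_520, 107_250, 107_136):
    full, leftover = split_into_supernodes(nodes)
    print(f"{nodes:,} nodes -> {full} full supernodes, {leftover} nodes left over")

print(f"inter-supernode bandwidth fraction: {inter_supernode_fraction():.4f}")
```

Of the three reported node counts, only 107,520 divides evenly into 256-node supernodes (420 of them). Under the assumed reading of the ratio, traffic leaving a supernode sees 3/16 (about 19%) of the bandwidth available inside it.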

In addition, OceanLite has a second network dedicated to I/O traffic.[1]

Footnotes

  1. BaGuaLu | Proceedings of the 27th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming

  2. [2110.14502] Closing the “Quantum Supremacy” Gap: Achieving Real-Time Simulation of a Random Quantum Circuit Using a New Sunway Supercomputer

  3. How China Made An Exascale Supercomputer Out Of Old 14 Nanometer Tech

  4. China’s New(ish) SW26010-Pro Supercomputer at SC23