CosmoFlow is a benchmark workload frozen in time: an artifact of what “scientific deep learning” looked like in 2018. It isn’t a real application and never has been. Despite being developed at NERSC, it accounts for 0% of the NERSC workload,[1] and it was developed to demonstrate the potential for deep learning in science,[2] not because it is actually a useful model. Cynically, it was developed to win the 2018 Gordon Bell Prize (which it did not).

Despite this, it has become ingrained in various MLCommons benchmarks as an HPC I/O benchmark dressed up as a science problem.

To be clear, CosmoFlow has no scientific relevance today.

Footnotes

  1. N10_Workload_Analysis.latest.pdf

  2. CosmoFlow, in Proceedings of the International Conference for High Performance Computing, Networking, Storage, and Analysis