?? NVIDIA Grace Hopper Superchip clusters have officially gone live, marking a paradigm shift in AI and high-performance computing (HPC). At ISC 2025 Hamburg, NVIDIA unveiled 9 operational clusters across France, Japan, and the UK, delivering over 200 exaflops of combined power. These systems leverage the groundbreaking NVLink-C2C interconnect to achieve 900 GB/s bandwidth between CPUs and GPUs - 7x faster than PCIe Gen5.
Redefining AI Infrastructure: The Grace Hopper Cluster Architecture
The newly activated clusters combine NVIDIA's Grace CPU (72 Arm Neoverse V2 cores) and Hopper GPU (144 streaming multiprocessors) into a single superchip. Each node packs 608 GB of unified memory - 512 GB of LPDDR5X for the CPU and 96 GB HBM3 for the GPU - accessible through hardware-coherent NVLink-C2C.
The NVLink-C2C Breakthrough
Traditional CPU-GPU systems struggle with PCIe bandwidth limitations (128 GB/s for Gen5 x16). NVIDIA's solution? NVLink-C2C creates a unified memory space across 256 superchips, enabling:
?? 900 GB/s bidirectional bandwidth between components
?? Hardware-level cache coherence without page migration
?? Atomic operations across 150 TB of pooled memory
Global Deployment: From Quantum Labs to Climate Research
Nine flagship installations showcase the cluster's versatility:
???? ABCI-Q (Japan)
Hybrid quantum-classical system using rubidium-atom QPUs for materials science simulations
???? Isambard-AI (UK)
168-node Phase 1 deployment achieving 4.25x speedup in pharmaceutical molecular docking
Industry Reactions
"Grace Hopper clusters aren't just faster - they're architecturally transformative. The memory unification allows researchers to think in exabytes rather than gigabytes."
- Tim Costa, NVIDIA's Quantum & HPC Director
The Software Stack: Democratizing 150TB Memory Access
NVIDIA's HPC SDK enables developers to leverage the cluster's full potential through:
?? CUDA Unified Virtual Memory across CPU/GPU
?? ISO C++/Fortran compliance without code modification
?? Quantum circuit simulation via cuQuantum
Key Takeaways
? 900 GB/s unified memory via NVLink-C2C
?? 9 operational clusters across 3 continents
?? 4.25x faster molecular docking in pharma research
?? 144 ARM cores + 544 tensor cores per superchip
?? Hybrid quantum-classical workflows enabled
See More Content about AI NEWS