The Sandia Vanguard program extends Sandia’s Advanced Architecture Testbed project under the Advanced Simulation and Computing (ASC) program. It aims to expand the high-performance computing ecosystem by evaluating emerging technologies and accelerating their development, so that they become viable for future large-scale production platforms. The goal is to reduce the risk of deploying unproven technologies by identifying gaps in the hardware and software ecosystem and making focused investments to address them. The approach is to invest early in the essential capabilities needed to move technologies from small-scale testbeds to large-scale production use.
Vanguard fills the gap between node- or rack-scale testbeds, which are vital for understanding the potential impact of cutting-edge hardware advancements, and the more mature hardware and software components that are ready to be deployed in Advanced Technology Systems (ATS) or Commodity Technology Systems (CTS). Vanguard makes it possible to evaluate realistic workloads that go beyond small-scale benchmarks and mini-apps. This is important for identifying the aspects of the hardware, software, and application development environment that must be addressed before a technology can be considered for deployment in production platforms. From a vendor perspective, Vanguard helps expand the set of technology choices for future systems, likely creating more competition among component and system providers. From an application perspective, the size and scale of the Vanguard systems help minimize the effort required to get codes running effectively on new platforms that stem from Vanguard prototypes.
Astra – World’s First Petascale ARM-based Supercomputer
Astra is the first large-scale prototype system to be deployed under the Sandia Vanguard program. The system is composed of 2,592 compute nodes, each with two sockets populated with 28-core Cavium ThunderX2 64-bit Armv8 processors, for a total of 145,152 cores. The theoretical peak performance of the machine is more than 2.3 petaflops. The platform will be deployed in partnership with Westwind Computing Products Inc. and Hewlett Packard Enterprise (HPE).
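As a rough check on these figures, the sketch below derives the total core count and a theoretical peak from the node, socket, and core counts given above. The clock rate (2.0 GHz) and the number of double-precision floating-point operations per core per cycle (8) are assumptions introduced here for illustration; neither is stated in this description, and the function names are hypothetical.

```python
# Counts taken from the system description.
NODES = 2592
SOCKETS_PER_NODE = 2
CORES_PER_SOCKET = 28

# Assumptions for illustration only (not stated in the description):
ASSUMED_CLOCK_HZ = 2.0e9      # assumed nominal core clock
ASSUMED_FLOPS_PER_CYCLE = 8   # assumed double-precision flops per core per cycle


def total_cores(nodes: int, sockets_per_node: int, cores_per_socket: int) -> int:
    """Total number of processor cores in the machine."""
    return nodes * sockets_per_node * cores_per_socket


def peak_flops(cores: int, clock_hz: float, flops_per_cycle: int) -> float:
    """Theoretical peak: every core retires its maximum flops on every cycle."""
    return cores * clock_hz * flops_per_cycle


cores = total_cores(NODES, SOCKETS_PER_NODE, CORES_PER_SOCKET)
peak = peak_flops(cores, ASSUMED_CLOCK_HZ, ASSUMED_FLOPS_PER_CYCLE)

print(f"{cores:,} cores, about {peak / 1e15:.2f} petaflops theoretical peak")
```

Under these assumptions the estimate is consistent with the stated figure of more than 2.3 petaflops. A theoretical peak of this kind is an upper bound; sustained application performance is typically lower.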
Advanced Tri-Lab Software Environment
The Advanced Tri-lab Software Environment (ATSE) has been initiated under the Sandia Vanguard program to address gaps in system software, compilers and tools, and libraries needed by the ASC application community. The priorities for ATSE are driven by the needs of the prototype systems, but ATSE also aims to serve as a vehicle for exploring new software technologies that may improve the overall ASC computing environment. It does so by enabling an open, modular, and extensible ecosystem that can be deployed and supported across all ASC platform types by a larger community of developers and vendors.