As explored in the HiPEAC Vision 2019, energy is an issue affecting the entire compute continuum, from tiny devices at the edge to enormous data centres. In the push towards exascale computing, improving energy efficiency is a driving factor for a number of reasons, says Professor Martin Schulz, chair for computer architecture and parallel systems at Technische Universität München (TUM) and a member of the board of directors at Leibniz Supercomputing Centre (LRZ).
‘One of the primary challenges for exascale is the total cost and variability associated with the energy and power consumed, not only by the high-performance computing (HPC) system but by the infrastructure supporting it,’ explains Tapasya Patki, a computer scientist at the Center for Applied Scientific Computing and principal investigator of the ECP Power Steering project at the United States’ Lawrence Livermore National Laboratory (LLNL). ‘Many potential exascale sites are bound by power constraints of around 20-30 MW. There may also be external factors such as a shortage of electricity, natural disasters and/or government-issued mandates that limit the supply of power even further. For others, reducing electricity costs in order to improve purchasing power is a key motivation.’
In the race to greater performance using less energy, the focus is often on hardware; however, as Tapasya argues, system software also plays a crucial role in achieving higher throughput and better utilization in constrained scenarios. To achieve this, the HPC community needs to better understand the underlying technical aspects of power and energy management, she stresses. ‘For example, there is a misguided assumption that giving more power to an application will always improve its performance, and that enforcing a power cap will always slow it down. Although true for processor-bound applications, such as high-performance LINPACK, it does not apply to most scientific applications, which tend to be bound by memory, input/output or network usage.’
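The point about power caps can be illustrated with a deliberately simple toy model. It is a sketch under stated assumptions rather than a measurement: the function name, the `compute_fraction` parameter and the cube-root relationship between power and clock frequency are illustrative choices, not something taken from PowerStack or from the researchers quoted here.

```python
def relative_runtime(cap_watts: float, max_watts: float, compute_fraction: float) -> float:
    """Toy estimate of runtime under a power cap, relative to the uncapped runtime.

    Assumptions (hypothetical, for illustration only):
    - dynamic power grows roughly with the cube of clock frequency,
      so frequency scales with the cube root of the available power;
    - only the processor-bound share of the runtime stretches when the
      clock slows; the memory, I/O and network share is unaffected.
    """
    if cap_watts <= 0 or max_watts <= 0:
        raise ValueError("power values must be positive")
    if not 0.0 <= compute_fraction <= 1.0:
        raise ValueError("compute_fraction must be between 0 and 1")

    # A cap above the maximum draw changes nothing, hence the min().
    freq_scale = min(1.0, (cap_watts / max_watts) ** (1.0 / 3.0))

    # The processor-bound part slows down; the rest stays the same.
    return compute_fraction / freq_scale + (1.0 - compute_fraction)


if __name__ == "__main__":
    for label, fraction in [("processor-bound", 0.9), ("memory-bound", 0.2)]:
        slowdown = relative_runtime(cap_watts=60.0, max_watts=100.0, compute_fraction=fraction)
        print(f"{label}: {slowdown:.2f}x the uncapped runtime")
```

In this model the same cap costs a processor-bound code far more than a memory-bound one, which is the effect described above. Real applications would need to be measured rather than modelled this crudely.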
Another less understood aspect, according to Tapasya, is processor manufacturing variability, where processors with the exact same microarchitecture exhibit different power and performance characteristics. ‘This is attributed to the chip fabrication process, and several vendors, including Intel and IBM, have confirmed that such variability is expected to worsen in the future and at larger scales.’
The answer? A software stack that can steer power based on site requirements, application characteristics and dynamic behaviour – all in a vendor-neutral way, says Siddhartha Jana, an HPC research scientist at Intel and subteam co-lead in the global Energy Efficient HPC Working Group (EE-HPC WG). ‘HPC PowerStack is a community-wide consortium that started in 2016 with the aim of bringing together experts from academia, research laboratories and industry to design a holistic, extensible power management framework,’ explains Siddhartha.
PowerStack explores hierarchical interfaces between components at three specific levels: batch-job schedulers, job-level runtime systems and node-level managers, according to Siddhartha. ‘Site-specific requirements such as cluster-level power bounds, user fairness or job priorities will be translated as inputs to the job scheduler. The job scheduler will choose power-aware scheduling plugins, managing allocations across multiple users and diverse workloads,’ he says. ‘Such allocations will serve as inputs to a fine-grained, job-level runtime system that manages application ranks, in turn relying on vendor-agnostic, node-level measurement and control mechanisms.’
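The three levels described here can be pictured as a chain of budget hand-offs. The following is a minimal sketch, not PowerStack's actual interface: the `Job` class, the function names, the priority-weighted split, the even per-node split and the 300 W node limit are all hypothetical stand-ins chosen to keep the example short.

```python
from dataclasses import dataclass

MAX_NODE_WATTS = 300.0  # hypothetical hardware limit per node


@dataclass
class Job:
    name: str
    nodes: int
    priority: float  # hypothetical site-defined weight


def schedule(cluster_bound_watts: float, jobs: list[Job]) -> dict[str, float]:
    """Scheduler level: split the cluster power bound across jobs.

    Assumption: each job's share is proportional to priority * node count.
    """
    weights = {job.name: job.priority * job.nodes for job in jobs}
    total = sum(weights.values())
    return {name: cluster_bound_watts * w / total for name, w in weights.items()}


def job_runtime(job_budget_watts: float, nodes: int) -> list[float]:
    """Job level: divide the job's budget evenly across its nodes."""
    return [job_budget_watts / nodes] * nodes


def node_manager(requested_watts: float) -> float:
    """Node level: never apply a cap above what the hardware can draw."""
    return min(MAX_NODE_WATTS, requested_watts)


def plan(cluster_bound_watts: float, jobs: list[Job]) -> dict[str, list[float]]:
    """Run the three levels in order and return the per-node caps for each job."""
    budgets = schedule(cluster_bound_watts, jobs)
    return {
        job.name: [node_manager(w) for w in job_runtime(budgets[job.name], job.nodes)]
        for job in jobs
    }


if __name__ == "__main__":
    jobs = [Job("simulation", nodes=4, priority=2.0), Job("analysis", nodes=2, priority=1.0)]
    for name, caps in plan(1000.0, jobs).items():
        print(name, caps)
```

Each level only sees the budget handed down from the level above, so the sum of the node caps cannot exceed the cluster bound. A real runtime would shift power between nodes as the application's behaviour changes rather than splitting it evenly.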
An overview of the envisioned PowerStack
PowerStack forms part of the wider United States Exascale Computing Project (ECP), which aims to develop an HPC ecosystem – system software, applications, platforms and computational science, along with workforce development – using a co-design approach. ECP comprises three different projects targeting different levels of the stack:
- Runtime system for application-level power steering, focusing on the safe execution and performance optimization of applications running in a power-constrained environment.
- Operating system and resource management for exascale, focusing on improving and augmenting the operating system (Argo) and associated resource management frameworks (Flux).
- Exascale performance application programming interface (API), focusing on designing a 'consistent interface and methodology' for monitoring hardware and software-based performance events.
Power management in ECP software stack: Runtime System for Application-Level Power Steering
Power management in ECP software stack: Argo – Operating System and Resource Management for Exascale
Power management in ECP software stack: Exa-PAPI – The Exascale Performance API
‘PowerStack is perfectly aligned with the first two projects. In the runtime project, the aim is to extend the use of the Global Extensible Open Power Manager (GEOPM), an open-source, community-driven power management application,’ says Tapasya. ‘PowerStack is also actively collaborating with the ECP Argo and Flux teams to develop a more holistic power management stack. The vision is to make resource managers and job schedulers interoperable with power / performance management frameworks such as GEOPM,’ she adds.
As for the collaboration’s results so far, PowerStack has demonstrated that ‘HPC sites can improve system efficiency by overprovisioning their resources and incorporating a scalable power-aware resource management framework, demonstrated using the widely used Slurm workload manager’, according to Tapasya. ‘At the application-node level, contributors have demonstrated performance improvements of anywhere from 5% to 30% on power-constrained systems using GEOPM, depending on application design and architecture’, adds Siddhartha. The papers from IPDPS’18 and ISC’17 cited below provide more details on these results.
Critically, PowerStack solutions are tested across multiple HPC facilities to ensure that they cater to the needs of a range of global sites, says Siddhartha. ‘We’ve been collaborating with the Energy Efficient HPC Working Group, which led to the first global survey analysing the current solutions for HPC sites in France, Italy, Japan, Saudi Arabia, Germany, the United Kingdom, and the United States.’ Since 2016, the size of the PowerStack consortium has grown considerably, and participants include national labs, system integrators, chip vendors, job scheduler and resource management vendors, and academic institutions – see the full list of participants below.
‘We would like to invite more collaborators, and are actively planning forums during the ISC19 and SC19 timeframes,’ adds Professor Masaaki Kondo, a research lead at the Advanced Institute for Computational Science in RIKEN and an Associate Professor of the Graduate School of Information Science and Technology at the University of Tokyo.
Contact points for PowerStack include Martin Schulz (Technical University of Munich), Masaaki Kondo (University of Tokyo/RIKEN), Tapasya Patki (LLNL/ECP), Siddhartha Jana (EEHPC-WG), and Jonathan Eastep (Intel/GEOPM PI).
Website: powerstack.lrr.in.tum.de
To get involved:
- Mailing lists for announcements: [email protected]
- Code repository for open collaboration: gitlab.com/powerstack
- Slack channel for discussion: powerstack.slack.com
PowerStack participants:
- National labs: LLNL, LANL, Sandia, Argonne, RIKEN, STFC/Hartree, Cineca, LRZ, Grenoble
- System Integrators: Cray, Fujitsu, HPE, ATOS/Bull, IBM
- Chip Vendors: x86 (Intel, AMD), ARM, POWER (IBM)
- Job scheduler / Resource manager vendors: PBSPro (Altair), ALPS (Cray), Cobalt (Argonne), Flux (LLNL), LSF (IBM)
- Academia: TU-Munich, TU-Dresden, UniBo, SDU, Univ of Tokyo, LRZ
- Facility and Operations: EEHPC-WG (Energy Efficient HPC Working Group)
Further reading:
- ‘The PowerStack Initiative (A Community-driven Effort)’, EEHPC-WG Webinar Series, October 2018
- ‘A Strawman for an HPC PowerStack’, OSTI technical report, August 2018
- ‘Analyzing Resource Trade-offs in Hardware Overprovisioned Supercomputers’, IPDPS’18 proceedings, 2018
- ‘Power and Performance Optimization at Exascale’, insideHPC, March 2018
- ‘Energy efficiency and the software stack’, insideHPC, December 2017
- ‘A global survey of HPC center energy and power-aware job scheduling and resource management’, November 2017
- ‘Global Extensible Open Power Manager: A vehicle for HPC Community Collaboration on Co-Designed Energy Management Solutions’, ISC’17 proceedings, June 2017
PowerStack Core Committee Members (alphabetically, left to right, top to bottom): Aniruddha Marathe (LLNL), Barry Rountree (LLNL), Carsten Trinitis (TUM), Christopher Cantalupo (Intel), Jonathan Eastep (Intel), Josef Weidendorfer (TUM), Martin Schulz (LRZ, TUM), Masaaki Kondo (RIKEN, Univ of Tokyo), Matthias Maiterth (LMU, Intel), Ryuichi Sakamoto (Univ. of Tokyo), Siddhartha Jana (EEHPC-WG, Intel), Tapasya Patki (LLNL, ECP)