[HiPEAC-announce] Ph D proposal : Modelization of a Petaflop range architecture for LQCD
Andre Seznec
Andre.Seznec at irisa.fr
Tue Oct 14 08:41:45 CEST 2008
Within the framework of the nationally funded PetaQCD project, the CAPS
project-team at IRISA/INRIA in Rennes, France, is proposing a PhD
fellowship on the following topic:
Modelization of a Petaflop range architecture for LQCD
Application context:
Lattice Quantum Chromodynamics (LQCD) is the lattice formulation of QCD,
the theory of the strong interaction in nuclear and sub-nuclear physics.
It simulates the properties of the strong interaction at a sub-nuclear
scale, modeling matter as a quark crystal. The computational demands of
Lattice QCD are enormous: they have played a role in the history of
supercomputers and are also helping to define their future.
The PetaQCD project aims at designing the hardware and software
architectures for a sustained performance of over a Petaflop for LQCD
simulations with large lattice sizes (up to 128^3*256). The PetaQCD
project is a multi-disciplinary collaboration of teams combining
expertise in the physics of QCD simulation, knowledge of the HMC (Hybrid
Monte Carlo) simulation code already used by the European Twisted Mass
Collaboration (ETMC), experience in building supercomputers for QCD
(through the apeNEXT project) and expertise in optimization techniques
for parallel architectures. The overall objective of the PetaQCD project
is to optimize the HMC code for LQCD simulation and to build a few nodes
of a supercomputer mock-up able to reach a sustained Petaflop
performance for this code.
Current studies show that performance 10 to 100 times higher than that
of currently available hardware is needed to achieve a sustained
Petaflop on a limited number of nodes (about 1000).
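As a back-of-the-envelope illustration of these figures, the sketch
below computes what each node would have to handle for the largest
lattice mentioned above, a sustained Petaflop and about 1000 nodes. The
even split of sites and work across nodes is an assumption made for the
example, and the function names are hypothetical.

```python
# Back-of-the-envelope sizing using the figures quoted in the text.
# Assumption: lattice sites and work are split evenly across the nodes.


def lattice_sites(dims):
    """Number of sites of a lattice with the given extents."""
    sites = 1
    for extent in dims:
        sites *= extent
    return sites


def per_node_requirements(dims, target_flops, nodes):
    """Sites and sustained flop/s per node under an even split."""
    sites = lattice_sites(dims)
    return {
        "total_sites": sites,
        "sites_per_node": sites / nodes,
        "sustained_flops_per_node": target_flops / nodes,
    }


if __name__ == "__main__":
    # 128^3 * 256 lattice, 1 Petaflop sustained, about 1000 nodes.
    req = per_node_requirements((128, 128, 128, 256), 1e15, 1000)
    for name, value in req.items():
        print(f"{name}: {value:.4g}")
```

Under these assumptions each node holds roughly 5.4e5 lattice sites and
must sustain about one Teraflop.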
Ph.D. subject:
On the hardware side, advances in integration make it conceivable to
attack grand challenge problems such as LQCD, which require
Petaflop-range performance, in the next decade. Within the next few
years, it will become possible to integrate tens and maybe hundreds of
powerful processors on a single chip. These multi-core chips might be
coupled with very powerful accelerators, e.g. second-generation GPGPUs.
Another possible scenario is that these multi-core chips will feature
vector facilities (e.g., enhanced SSE or Altivec functionalities). A
single-node system delivering peak performance in the Teraflop range
will probably be possible to design from off-the-shelf multi-cores
and/or accelerators within five years. In any case, Petaflop-scale
problems such as LQCD will therefore still require a large number of
nodes in the foreseeable future (thousands, maybe even tens of
thousands).
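The node counts above follow from simple arithmetic, sketched here. The
per-node peak of one Teraflop comes from the text; the fraction of peak
that an application actually sustains is an assumed parameter chosen for
illustration, not a measured value, and nodes_needed is a hypothetical
helper.

```python
import math


def nodes_needed(target_flops, peak_flops_per_node, sustained_fraction):
    """Nodes required to sustain target_flops.

    sustained_fraction is the assumed share of per-node peak performance
    that the application actually achieves (0 < fraction <= 1).
    """
    if not 0.0 < sustained_fraction <= 1.0:
        raise ValueError("sustained_fraction must be in (0, 1]")
    sustained_per_node = peak_flops_per_node * sustained_fraction
    return math.ceil(target_flops / sustained_per_node)


if __name__ == "__main__":
    # 1 Petaflop target, 1 Teraflop peak per node, assumed efficiencies.
    for fraction in (1.0, 0.5, 0.25, 0.1):
        print(fraction, nodes_needed(1e15, 1e12, fraction))
```

With a sustained fraction between 10% and 100% of peak, the count ranges
from about a thousand to about ten thousand nodes.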
The objective of the Ph.D. study is to develop a new methodology to
explore the potential performance and power consumption of a design
built from multiple multi-core chips. The study will focus on the LQCD
application, and possibly on multiple versions of this application, as
benchmarks, thus leveraging the knowledge that the whole PetaQCD
consortium has of the application. Memory access and synchronization
are known to be the main performance bottlenecks in this application.
Therefore, we will essentially focus on modeling the behavior of the
memory hierarchy.
The objective is to obtain a first-order comparison of different design
options for an LQCD machine based on off-the-shelf multi-cores or
multi-core+accelerator designs, thereby guiding the dimensioning of a
dedicated machine for LQCD. The methodology should be adaptable to the
study of other massively parallel applications in order to understand
their performance behavior. It should also be useful in early multi-core
design phases to help decide on the internal die organization, such as
the number of cores vs. cache size or the hierarchical organization.
We will first analyze the data bandwidth needed at each level of the
memory hierarchy in the main phases of our target applications, i.e.
data/block per instruction for each cache size. We will then use these
data to derive an analytical model representing the memory behavior of
the application for different hierarchical organizations of the
multi-core and multi-node architecture. The second step will be to
propose a methodology mixing analytical modeling and simulation to
derive the estimated performance from the analytical model of the memory
bandwidth demand of applications. The parameters to be modeled are the
organization of the memory hierarchy, the access latency of each memory
level and the memory bandwidth at each level.
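To make the two steps above concrete, here is a minimal sketch and not
the methodology of the project itself. It assumes a fully associative
LRU cache to measure blocks fetched per instruction for several cache
sizes, and a first-order bound in which computation, data transfer and
latency stalls overlap perfectly. All names, the concurrency parameter
and the trace are hypothetical.

```python
from collections import OrderedDict


def lru_misses(trace, capacity_blocks):
    """Misses of a fully associative LRU cache of capacity_blocks blocks."""
    cache = OrderedDict()
    misses = 0
    for block in trace:
        if block in cache:
            cache.move_to_end(block)       # mark as most recently used
        else:
            misses += 1
            cache[block] = True
            if len(cache) > capacity_blocks:
                cache.popitem(last=False)  # evict least recently used
    return misses


def blocks_per_instruction(trace, instructions, capacities):
    """Blocks fetched per instruction for each cache size (in blocks)."""
    return {c: lru_misses(trace, c) / instructions for c in capacities}


def level_time(instructions, bpi, block_bytes, bandwidth, latency,
               concurrency):
    """Time spent at one memory level: the larger of the bandwidth-bound
    transfer time and the latency-bound stall time, where concurrency is
    the assumed number of outstanding misses."""
    misses = instructions * bpi
    transfer = misses * block_bytes / bandwidth
    stall = misses * latency / concurrency
    return max(transfer, stall)


def estimated_time(instructions, instr_rate, levels):
    """First-order execution time assuming perfect overlap between
    computation and every memory level. levels is a list of tuples
    (bpi, block_bytes, bandwidth, latency, concurrency)."""
    compute = instructions / instr_rate
    memory = [level_time(instructions, *level) for level in levels]
    return max([compute] + memory)


if __name__ == "__main__":
    # Hypothetical trace: a 4-block working set swept twice.
    trace = [0, 1, 2, 3] * 2
    bpi = blocks_per_instruction(trace, instructions=16, capacities=[2, 4])
    print(bpi)
    for capacity, value in bpi.items():
        t = estimated_time(1000, 1000.0,
                           [(value, 64, 16000.0, 0.0078125, 4)])
        print(capacity, t)
```

In this toy trace the working set does not fit in the 2-block cache, so
every access misses and the estimate becomes bandwidth-bound; with the
4-block cache the second sweep hits and the same code is compute-bound.
A real study would replace the trace with one collected from the HMC
code and calibrate the parameters per level.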
References:
1) Matteo Monchiero, Ramon Canal and Antonio Gonzalez. Design space
exploration for multi-core architectures: a power/performance/thermal
view. ICS '06: Proceedings of the 20th Annual International Conference
on Supercomputing, pages 177-186.
2) Gianfranco Bilardi, Andrea Pietracaprina, Geppino Pucci, Sebastiano
Fabio Schifano, Raffaele Tripiccione: The Potential of On-Chip
Multiprocessing for QCD Machines. HiPC 2005: 386-397
3) White paper on "Computational resources for Lattice QCD 2010-2014",
http://theory.fnal.gov/theorybreakout2007/LatticeQCD2010-2014.pdf.
4) F. Belletti, S. F. Schifano, R. Tripiccione, F. Bodin, P. Boucaud, J.
Micheli, O. Pène, N. Cabibbo, S. Luca, A. Lonardo, D. Rossetti, P.
Vicini, M. Lukyanov, L. Morin, N. Paschedag, H. Simma, V. Morenas, D.
Pleiter, and F. Rapuano. Computing for LQCD: apeNEXT. Computing in
Science and Engineering, 8(1):18-29, 2006.
5) K. Z. Ibrahim, F. Bodin, and O. Pène. Fine-grained Parallelization of
Lattice QCD Kernel Routine on GPUs. First Workshop on General Purpose
Processing on Graphics Processing Units, Northeastern Univ., Boston, Oct
2007.
6) P. Vranas, M. A. Blumrich, D. Chen, A. Gara, M. E. Giampapa, P.
Heidelberger, V. Salapura, J. C. Sexton, R. Soltz, and G. Bhanot.
Massively Parallel Quantum Chromodynamics. IBM Journal of Research and
Development, 52(1/2), 2008.
Financial support: INRIA doctoral fellowship, ~ 1500 euros net per month
Expected date: December 2008 or January 2009
Contact:
André Seznec
seznec at irisa.fr
IRISA/INRIA
Campus de Beaulieu
35042 Rennes Cedex
Tel: (33) 299847336