Code Generation Framework for Stencil Codes


Evolving Efficient and Generalizable Multigrid Methods with Genetic Programming


Higher-Order Discretizations for the Shallow Water Equations

The ExaStencils code generation framework processes input in its own multi-layered domain-specific language (DSL) ExaSlang to compose highly optimized and massively parallel geometric multigrid solvers on (block-)structured grids.


Domain-specific representation and modeling

The ExaStencils code generation framework consists of a source-to-source compiler written in Scala and a multi-layered external DSL called ExaSlang (short for ExaStencils language) tailored for stencil codes in general and multigrid solvers in particular. Each layer targets a certain user community and has a different degree of abstraction. This way, domain experts can formulate problems in a manner they are most familiar with, resulting in a separation of concerns and improved productivity.

The layers become more concrete as the layer number rises, i.e. Layer 1 is the most abstract. In total, four layers exist:

  • ExaSlang 1: Continuous formulation of the problem. LaTeX-like syntax including specifications with Unicode symbols. Envisioned for users with little interest in programming and numerical components.
  • ExaSlang 2: Discretization of the problem using Finite Differences (FD), Finite Volumes (FV) or Finite Elements (FE). Mostly used by domain experts for a certain application field, e.g. CFD.
  • ExaSlang 3: Definition of numerical solvers in a Matlab-like syntax. Different multigrid variants can be set up easily. Target users are (applied) mathematicians.
  • ExaSlang 4: Composition of whole program specifications. Data structures, parallelization schemes, data I/O and visualization are available for fine-tuning. Most frequently used by computer scientists.

A problem can be implemented by providing different components of the problem and solver specification on different layers. Alternatively, users can work exclusively on one layer and have ExaStencils generate code for subsequent layers automatically. It is also possible to write the whole program entirely in Layer 4.
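To make concrete what kind of solver such a specification ultimately describes, the following is a minimal sketch of a geometric multigrid V-cycle for the 1D Poisson problem -u'' = f with homogeneous Dirichlet boundaries. It is plain Python written for illustration, not ExaSlang and not code generated by ExaStencils; the model problem, the function names, the weighted Jacobi smoother with damping 2/3, and the choice of two pre- and post-smoothing sweeps are all assumptions made for this example.

```python
import math

def residual(u, f, h):
    """r = f - A u for the 3-point stencil [-1, 2, -1] / h^2 (zero boundaries)."""
    n = len(u)
    r = [0.0] * n
    for i in range(n):
        left = u[i - 1] if i > 0 else 0.0
        right = u[i + 1] if i < n - 1 else 0.0
        r[i] = f[i] - (2.0 * u[i] - left - right) / (h * h)
    return r

def smooth(u, f, h, omega=2.0 / 3.0, sweeps=2):
    """Weighted Jacobi: u <- u + omega * D^{-1} r with D = 2 / h^2."""
    for _ in range(sweeps):
        r = residual(u, f, h)
        u = [ui + omega * (h * h / 2.0) * ri for ui, ri in zip(u, r)]
    return u

def restrict(r):
    """Full weighting; coarse point j sits at fine index 2j + 1."""
    n_coarse = (len(r) - 1) // 2
    return [0.25 * r[2 * j] + 0.5 * r[2 * j + 1] + 0.25 * r[2 * j + 2]
            for j in range(n_coarse)]

def prolong(e, n_fine):
    """Linear interpolation of a coarse-grid correction to the fine grid."""
    fine = [0.0] * n_fine
    for j, ej in enumerate(e):
        fine[2 * j] += 0.5 * ej
        fine[2 * j + 1] += ej
        fine[2 * j + 2] += 0.5 * ej
    return fine

def v_cycle(u, f, h):
    """One V(2,2)-cycle; expects len(u) = 2^k - 1 unknowns."""
    n = len(u)
    if n == 1:
        return [f[0] * h * h / 2.0]          # exact solve on the coarsest grid
    u = smooth(u, f, h)                       # pre-smoothing
    r_coarse = restrict(residual(u, f, h))    # restrict the residual
    e_coarse = v_cycle([0.0] * len(r_coarse), r_coarse, 2.0 * h)
    u = [ui + ei for ui, ei in zip(u, prolong(e_coarse, n))]
    return smooth(u, f, h)                    # post-smoothing

# Usage: right-hand side chosen so that the continuous solution is sin(pi x).
n = 63
h = 1.0 / (n + 1)
f = [math.pi ** 2 * math.sin(math.pi * (i + 1) * h) for i in range(n)]
u = [0.0] * n
for _ in range(15):
    u = v_cycle(u, f, h)
```

Each ingredient here (smoother, restriction, prolongation, cycle type) is a point where different multigrid variants differ, which is the kind of choice the solver layer is meant to expose.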

Platform Versatility

Applications generated from ExaSlang input can target different hardware platforms:

  • CPU
  • GPU
  • ARM
  • FPGA (not in production)
  • hybrid architectures

Parallelization Backends

ExaStencils provides automatic parallelization and optimized communication schemes based on:

  • MPI
  • OpenMP (on CPUs)
  • CUDA
  • arbitrary combinations

Automatic Optimizations

ExaStencils provides code transformations to optimize performance. Among others, these include:

  • Vectorization
  • Common subexpression elimination
  • Polyhedral loop transformations
  • Loop unrolling
  • Address pre-calculation
  • Arithmetic transformations

Supercomputing Power

The performance and scalability of codes generated by the ExaStencils framework have been demonstrated on multiple large-scale clusters:

  • JUWELS - Forschungszentrum Jülich (FZJ)
  • Piz Daint - Swiss National Supercomputing Centre (CSCS)
  • JUQUEEN - Forschungszentrum Jülich (FZJ)

Application Scope

ExaStencils goes beyond handling simplified benchmarking problems and has been used for simulations requiring solvers for the

  • Poisson
  • Stokes
  • Navier-Stokes
  • Shallow water
  • Helmholtz
equations, as well as multiple image processing applications.

For different applications, different specialized features are required:

  • Computational grids: (non-)uniform, (non-)axis-aligned, staggered
  • Different discretization techniques and boundary types
  • Solvers and their components: (non-)linear multigrid, point-, block- and Krylov-smoothers



The simulation of ocean currents, tides and coastal ocean circulation is an important field of research. These phenomena can be modeled with the shallow water equations. One efficient approach to discretizing them is the discontinuous Galerkin method, owing to its parallelizability, its ability to use high-order approximation spaces, its robustness for problems with shocks, and its natural support for h- and p-adaptivity. GHODDESS (Generation of Higher-Order Discretizations Deployed as ExaSlang Specifications) is a Python-based front-end to ExaSlang that provides capabilities to express these components, as well as whole ocean simulations, in a compact way. It builds on the symbolic algebra package SymPy combined with domain-specific abstractions for the application at hand. Simulation specifications are automatically mapped to corresponding ExaSlang variants. From there, the full power of the ExaStencils toolchain can be used to generate optimized and parallelized code.


Since the construction of efficient multigrid solvers for the (non-)linear systems arising from the discretization of partial differential equations can be a challenging task, EvoStencils tackles this problem with genetic programming. A solver is represented by a tree of mathematical expressions that is generated from a tailored grammar. Each solver's quality is evaluated with respect to two objectives, namely compute performance and convergence rate. The former is estimated using the roofline model, whereas local Fourier analysis is applied for the latter. This multi-objective optimization is done via grammar-guided genetic programming and evolution strategies. With this approach, efficient and scalable multigrid solvers can be constructed automatically. EvoStencils is also implemented in Python and utilizes the DEAP framework for the implementation of evolutionary algorithms.
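The two objectives can be illustrated with a small sketch. It is not EvoStencils code and does not use DEAP: the function names, the candidate data and the hardware numbers are hypothetical. The first function is the standard roofline bound (a kernel is limited either by peak floating-point throughput or by memory bandwidth); the other two select the candidates that are not dominated in both objectives, which is the usual notion of optimality in multi-objective optimization.

```python
def roofline_time(flops, bytes_moved, peak_flops, bandwidth):
    """Lower bound on kernel time: compute-bound or memory-bound, whichever is slower."""
    return max(flops / peak_flops, bytes_moved / bandwidth)

def dominates(a, b):
    """True if objective vector a is no worse than b everywhere and better somewhere.
    Both objectives are minimized."""
    return (all(x <= y for x, y in zip(a, b))
            and any(x < y for x, y in zip(a, b)))

def pareto_front(candidates):
    """Keep candidates whose (time, convergence factor) pair is not dominated."""
    return [c for c in candidates
            if not any(dominates(o[1:], c[1:]) for o in candidates)]

# Hypothetical solvers: (name, estimated time per iteration, convergence factor).
candidates = [
    ("cheap_slow_convergence", 1.0, 0.5),
    ("costly_fast_convergence", 2.0, 0.1),
    ("costly_slow_convergence", 2.5, 0.5),
]
front = pareto_front(candidates)

# Hypothetical kernel: 1e9 flops and 8e9 bytes on a 1e12 flop/s, 1e11 byte/s machine.
t = roofline_time(1e9, 8e9, 1e12, 1e11)
```

In this toy data set the third candidate is dominated by the first (same convergence factor, higher cost), so only the first two remain on the front; the hypothetical kernel is memory-bound.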


ExaStencils was funded by the German Research Foundation (DFG) as part of the Priority Programme 1648 (Software for Exascale Computing) for a total of six years.

GHODDESS was funded by the German Research Foundation (DFG) for a total of three years.

Project Particulars

LFA Lab: A library for convergence rate prediction using local Fourier analysis.

Multigrid methods have many parameters that influence the execution time of the method. To identify a good parameter configuration, we need to predict the execution time of the method for various parameter configurations.

Predicting the time of the individual operations that a method performs is not sufficient; it is also necessary to predict how many of them are executed. Multigrid methods are iterative, i.e., they compute a sequence of iterates that converges towards the solution sought. The number of iterations needed to reach a certain accuracy depends on the convergence rate of the method.

The parameters of a multigrid method influence its convergence rate. The convergence rate is not available directly, but it can be estimated using local Fourier analysis.

As part of the ExaStencils project, we developed LFA Lab, a flexible software library that performs a local Fourier analysis. LFA Lab takes as input the formula for the error propagator of the method and a description of the operations involved in this formula. Using this information, LFA Lab is able to predict the convergence rate of a given multigrid method.
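As a small worked instance of local Fourier analysis, the sketch below computes the smoothing factor of weighted Jacobi for the 1D Poisson stencil [-1, 2, -1] / h^2. It does not use the LFA Lab API; the function names and the sampling approach are assumptions for illustration. For this stencil the error propagator of weighted Jacobi, I - omega * D^{-1} * A, has the Fourier symbol 1 - omega * (1 - cos(theta)), and the smoothing factor is the largest magnitude of that symbol over the high frequencies theta in [pi/2, pi].

```python
import math

def jacobi_symbol(theta, omega):
    """Fourier symbol of the weighted Jacobi error propagator for [-1, 2, -1] / h^2."""
    return 1.0 - omega * (1.0 - math.cos(theta))

def smoothing_factor(omega, samples=1001):
    """Maximum of |symbol| over sampled high frequencies in [pi/2, pi]."""
    step = (math.pi / 2.0) / (samples - 1)
    thetas = [math.pi / 2.0 + k * step for k in range(samples)]
    return max(abs(jacobi_symbol(t, omega)) for t in thetas)

mu_damped = smoothing_factor(2.0 / 3.0)   # analytically 1/3
mu_undamped = smoothing_factor(1.0)       # analytically 1: no smoothing at theta = pi
```

The damping 2/3 balances the symbol at both ends of the high-frequency range (+1/3 at pi/2, -1/3 at pi), whereas undamped Jacobi leaves the most oscillatory mode unreduced.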

By combining the convergence rate estimate with a performance model, it becomes possible to predict the execution time of a multigrid method for a given parameter configuration. This knowledge can then be used to identify an optimal parameter configuration.
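The combination can be sketched in a few lines, under the simplifying assumption that the error shrinks by a constant factor rho per iteration, so that reaching a relative tolerance tol takes about ceil(log(tol) / log(rho)) iterations. The function names and the three parameter configurations are hypothetical; in practice rho would come from local Fourier analysis and the time per iteration from a performance model.

```python
import math

def iterations_needed(rho, tol):
    """Smallest k with rho**k <= tol, assuming a constant convergence factor rho."""
    if not 0.0 < rho < 1.0:
        raise ValueError("convergence factor must lie in (0, 1)")
    if not 0.0 < tol < 1.0:
        raise ValueError("tolerance must lie in (0, 1)")
    return math.ceil(math.log(tol) / math.log(rho))

def predicted_time(rho, time_per_iteration, tol=1e-8):
    """Predicted time to solution = number of iterations * time per iteration."""
    return iterations_needed(rho, tol) * time_per_iteration

# Hypothetical configurations: name -> (convergence factor, time per iteration).
configs = {
    "A": (0.30, 1.0),
    "B": (0.12, 2.5),
    "C": (0.60, 0.4),
}
best = min(configs, key=lambda name: predicted_time(*configs[name]))
```

With these made-up numbers the configuration with the worst convergence factor still wins, because its iterations are cheap enough to compensate; this is why neither the convergence rate nor the per-iteration cost can be optimized in isolation.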

GitHub Project