Implementing SPMD-HPF using SHPF and KeLP

John Merlin VCPC, and
Scott Baden, University of California at San Diego.


Some important non-uniform applications entail taking finite differences on an irregular collection of uniform meshes. Examples include adaptive mesh refinement and multiblock problems.

Such applications cannot readily be implemented purely in High Performance Fortran (HPF), as HPF is designed primarily for data parallelism over regular rectangular regions. However, since the components of an irregular data domain may themselves be regular, it is appropriate to apply HPF to the individual meshes, and to use an SPMD program written in another language to coordinate the HPF programs and to perform the required inter-block data movements, i.e. to `glue' the pieces together.

We have implemented such a `SPMD-HPF' programming system using two public domain software tools:

KeLP handles the array allocation and decomposition for an irregular set of blocks, and the data motion between them. It invokes HPF procedures on each block as a separate parallel computation.

Interfacing KeLP and SHPF

The main task involved in interfacing KeLP to HPF is to arrange for KeLP to pass arguments to HPF procedures in the required form. The details of how arguments are passed in HPF depend on which HPF compiler is used. With the SHPF compiler, each array argument is accompanied by an extra argument (not visible in the original HPF code) called a mapping descriptor, which contains complete information about the array's mapping (i.e. its distribution). This convention is used by most HPF compilers, although the details of how information is stored in the descriptor is compiler-dependent.

To accommodate this, new KeLP classes were derived from the existing KeLP classes to store the necessary extra information, e.g. the mapping descriptors (in the format required by SHPF). In addition, some macros were defined to assist in calling HPF routines from KeLP.

For example, the HPF subroutine:

    SUBROUTINE update (a, b)
      REAL a (100), b
      ...
    END
is called from KeLP as follows:
    HPF_SUBR(update) (HPF_ARRAY_ARG(array), & b);
The macro HPF_SUBR changes the subroutine name update in accordance with SHPF's `name-mangling' convention, and the macro HPF_ARRAY expands the KeLP object array into two comma-separated arguments - a data array and its mapping descriptor - as expected by SHPF-compiled HPF procedures. The address operator & is applied to scalar arguments such as b, since in Fortran arguments are passed by address rather than by value.

The same KeLP-HPF application interface should work with most HPF compilers. It is simply a matter of redefining the KeLP classes and macros suitably for the given HPF compiler.

The integration also involved enhancing SHPF's run-time system to support general MPI communicators, so that HPF code may be run within a dynamically defined subset of the processors assigned to the run. As well as allowing the concurrent execution of different HPF procedures, which is necessary for the KeLP-HPF implementation, this will also enable task parallelism within HPF, and is necessary to fully support some HPF features, such as PURE procedures in FORALL, INDEPENDENT loops and the tasking construct (an HPF-2 `Approved Extension').

Demonstration application

To demonstrate the system we have written a simple KeLP-HPF program to solve Poisson's Equation by red-black relaxation on a domain consisting of a union of rectangular blocks of various sizes. The number of blocks, their sizes and positions, and their allocation to processors are input at runtime. The red-black relaxation within each block is coded in HPF, and KeLP is used to exchange data between blocks and to invoke HPF concurrently on the different blocks. This program may be downloaded here (gzipped tarfile, 4.7 Kb).

For example, the following input data to the red-black program:

  5
  0   6      0  11     1  2     0
  5  16      0   6     2  1     2
  0  11     10  16     2  1     4
  0   6     15  21     1  1     6
  0  16     20  26     3  1     7
sets-up a domain consisting of 5 rectangular blocks as shown on the right, and distributes them (from bottom to top) over 2, 2, 2, 1 and 3 processors respectively (which means that each processor receives the same number of grid points, i.e. a 5 x 5 subgrid, so the computation is load-balanced).

Further information

Multiple data parallelism with HPF and KeLP. (Gzipped Postscript, 73 Kb).
J. H. Merlin, S. B. Baden, S. J. Fink and B. M. Chapman. Proc. HPCN'98, Amsterdam, Netherlands, April 21 - 23 1998. To appear.
An early version of this system was demonstrated at Supercomputing '96 in Pittsburg. The following slides were prepared to accomany this demo:
Managing multiple HPF invocations using SHPF and KeLP. (Gzipped Postscript, 88 Kb).
Slides for a poster to accompany the demo at Supercomputing '96
Source code for HPF-KeLP application. (Gzipped Postscript, 52 Kb).
Slides showing the source code of the demo program.

Please address comments and questions about this work to John Merlin, jhm@vcpc.univie.ac.at.


John Merlin (jhm@vcpc.univie.ac.at).
Last updated Mon Feb 9 1998.