Advanced Optimization Techniques for High Performance Fortran

Vikram Adve, Rob Fowler, Guohua Jin, Ken Kennedy and John Mellor-Crummey.
Rice University.

Abstract

With current commercial HPF compilers, users often have to restructure their codes extensively to obtain good performance, even for regular data-parallel applications on message-passing systems. The Rice dHPF compiler project aims to develop advanced optimization techniques that provide consistently high performance for a broad spectrum of scientific applications with minimal restructuring of existing Fortran 77 or Fortran 90 codes. The foundation of the compiler is an abstract, integer-set-based approach to data-parallel program analysis and optimization. This framework has enabled us to implement a comprehensive collection of compile-time optimizations, including advanced optimizations that either had never been implemented before or had been implemented only in restricted form in a few compilers.
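As a rough illustration of what an integer-set formulation can look like, the sketch below computes the non-local data one processor needs for a three-point stencil over a BLOCK-distributed array. It is not taken from dHPF: the function names (block_owned, nonlocal_reads), the owner-computes assumption, and the use of explicitly enumerated Python sets are all assumptions made for illustration. A compiler would manipulate such sets symbolically at compile time rather than enumerating them.

```python
def block_owned(n, nprocs, p):
    """Indices 1..n owned by processor p (0-based) under a BLOCK distribution."""
    block = -(-n // nprocs)          # ceiling division: block size per processor
    lo = p * block + 1
    hi = min(n, (p + 1) * block)
    return set(range(lo, hi + 1))    # empty if lo > hi


def nonlocal_reads(n, nprocs, p, offsets):
    """Elements of b that processor p must receive for a(i) = f(b(i+d), d in offsets).

    Assumes the owner-computes rule: p executes iteration i iff it owns a(i),
    and a and b share the same BLOCK distribution.
    """
    owned = block_owned(n, nprocs, p)
    # Iteration set: owned iterations whose references stay inside 1..n.
    iterations = {i for i in owned if all(1 <= i + d <= n for d in offsets)}
    # Data set: every element of b read by those iterations.
    accessed = {i + d for i in iterations for d in offsets}
    # Communication set: data read but not owned locally.
    return accessed - owned


halo = nonlocal_reads(n=16, nprocs=4, p=1, offsets=(-1, 0, 1))
print(sorted(halo))  # [4, 9]
```

Here processor 1 owns elements 5 through 8, so the only values it must receive are the two boundary elements held by its neighbors. Expressing iteration, data, and communication sets as set operations is what lets a single framework describe many different optimizations.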

The project is currently focused on HPF versions of the NAS application benchmarks, which we have developed with minimal rewriting of the serial benchmarks (modifying fewer than 100 source lines per application). In addition to the traditional data-parallel optimizations described in the literature, we have developed several essential compiler optimizations required by features common to such real-world benchmarks. These optimizations include:

The dHPF project is also developing optimization techniques specific to distributed shared memory systems, both software- and hardware-supported. These include aggressive data restructuring and buffering techniques, with associated loop transformations, to maximize data locality; use of explicit communication analysis (originally developed for message passing) to minimize synchronization overhead through lightweight point-to-point synchronization; and dynamic scheduling techniques that achieve load balance while preserving data locality.

Notes by Chuck Koelbel