Citation
Functional decomposition using the force parallel programming language

Material Information

Title:
Functional decomposition using the force parallel programming language
Creator:
Cooksey, Robert Neale
Publication Date:
Language:
English
Physical Description:
xi, 66 leaves : illustrations ; 29 cm

Subjects

Subjects / Keywords:
FORTRAN (Computer program language) ( lcsh )
Decomposition method ( lcsh )
Decomposition method ( fast )
FORTRAN (Computer program language) ( fast )
Genre:
bibliography ( marcgt )
theses ( marcgt )
non-fiction ( marcgt )

Notes

Bibliography:
Includes bibliographical references (leaf 66).
General Note:
Submitted in partial fulfillment of the requirements for the degree, Master of Science, Computer Science.
General Note:
Department of Computer Science and Engineering
Statement of Responsibility:
by Robert Neale Cooksey.

Record Information

Source Institution:
University of Colorado Denver
Holding Location:
Auraria Library
Rights Management:
All applicable rights reserved by the source institution and holding location.
Resource Identifier:
38329686 ( OCLC )
ocm38329686
Classification:
LD1190.E52 1997m .C66 ( lcc )

Full Text
FUNCTIONAL DECOMPOSITION USING THE
FORCE PARALLEL PROGRAMMING LANGUAGE
by
Robert Neale Cooksey
B.S., University of Colorado at Colorado Springs, 1985
A thesis submitted to the
University of Colorado at Denver
in partial fulfillment
of the requirements for the degree of
Master of Science
Computer Science
1997


This thesis for the Master of Science
degree by
Robert Neale Cooksey
has been approved
by
Gita Alaghband


Cooksey, Robert Neale (M.S., Computer Science)
Functional Decomposition Using the Force Parallel Programming Language
Thesis directed by Professor Gita Alaghband
ABSTRACT
The Force is a parallel programming language that provides a set of parallel
extensions to the Fortran language, and is based on the global parallelism model in
that the parallelism of a Force program is present for the entire program execution.
The language is designed for large-scale shared memory multiprocessors, and has
been successfully implemented on a variety of different systems. The Force language
is implemented as a two-layer macro preprocessor, allowing it to be readily ported
from one system to another. The Force language has been ported to both the Silicon
Graphics Power Challenge and Origin2000 multiprocessors. This paper presents the
steps involved in porting the Force language to a new platform, and discusses the
main problems encountered during the port process to the Silicon Graphics systems.
The Silicon Graphics implementation of the Force language introduces a construct
that allows the parallelism within a Force program to be divided along function. This
functional separation of an application allows multiple parallel segments to be
executed in parallel with respect to each other, with each parallel segment being
executed within its own parallel environment. This paper explains how functional
parallelism has been implemented in the Force language by the Resolve construct.
This abstract accurately represents the content of the candidate's thesis. I recommend
its publication.
Signed
Gita Alaghband


DEDICATION
To my wife Wendi.


ACKNOWLEDGEMENT
My thanks to Dr. Alaghband for her support and patience while working with me as
my thesis advisor; she has been both a mentor and a friend.
This work was partially supported by the National Center for Supercomputing
Applications under grant number ASC970023N and utilized the computer system
Silicon Graphics Origin2000 at the National Center for Supercomputing Applications,
University of Illinois at Urbana-Champaign.


CONTENTS
CHAPTER
1. INTRODUCTION
2. THE FORCE PARALLEL PROGRAMMING LANGUAGE
   Force Language Concepts
      Program Structure
      Declaration of Variables
      Parallel Execution
      Synchronization
   Force Parallel Environment
   Force Language Implementation
      Parameterized Function Macros
      Force to Fortran Translation
3. FORCE ON THE SILICON GRAPHICS PLATFORM
   IRIX Operating System
   Power Challenge Architecture
   Origin2000 Architecture
      Node Boards
      Distributed Shared-Memory


4.4 Unify Macro Expansion
4.5 Resolve Program Segment
5.1 Selfsched DO Test Programs


TABLES
Table
2.1 Force Parallel Environment Variables
2.2 Low-Level Machine-Dependent Macros
4.1 Resolve Parallel Environment Variables
4.2 Resolve Test Program Results
5.1 Selfsched DO Test Results


CHAPTER 1
INTRODUCTION
The Force is a parallel programming language designed for large-scale shared-
memory multiprocessors. It is implemented as a two-layer macro preprocessor
extending the Fortran language and includes constructs that support both fine-grained and
coarse-grained parallelism. The Force language is based on the global parallelism
model. Under this model, all the processes execute a single program completely and
in parallel, so the parallelism in a program is present from the beginning of the
program. The number of processes is determined at run time and remains fixed for
the entire duration of the program execution. Other than the processes requested by
the Force, no additional processes are allocated during the execution of the program,
nor are any processes relinquished to, or allocated from, a pool of free processes.
Implementing the language as a two-layer macro preprocessor allows machine
dependencies to be hidden from the user, and eases the porting of the language from
one platform to the next. The machine dependencies are embedded in low-level
macros, which are in turn used to build the machine-independent high-level language
constructs. Porting the language to a new platform involves converting the
low-level macros to use the machine-specific parallel extensions, and writing a
Force driver that creates and initializes the Force parallel environment.
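The two-layer arrangement can be illustrated with a small Python sketch. This is not the Force preprocessor, which is macro based and emits Fortran; the macro names (CRITICAL, END_CRITICAL, LOCK, UNLOCK), the platform names, and the generated calls are all hypothetical and serve only to show how a high-level construct stays unchanged while the low-level table is swapped per platform.

```python
# Hypothetical two-layer macro expansion. Every macro name, platform name,
# and generated call below is illustrative and is not taken from the Force.

# Layer 2: low-level, machine-dependent macros (one table per platform).
LOW_LEVEL = {
    "platform_a": {"LOCK": "call a_lock({0})", "UNLOCK": "call a_unlock({0})"},
    "platform_b": {"LOCK": "call b_acquire({0})", "UNLOCK": "call b_release({0})"},
}

# Layer 1: high-level, machine-independent constructs, written only in
# terms of the low-level macros.
HIGH_LEVEL = {
    "CRITICAL": ["LOCK({0})"],
    "END_CRITICAL": ["UNLOCK({0})"],
}


def expand(line, platform):
    """Expand one source line into a list of target-language lines."""
    name, _, rest = line.strip().partition("(")
    arg = rest[:-1] if rest.endswith(")") else rest
    if name in HIGH_LEVEL:
        out = []
        for template in HIGH_LEVEL[name]:
            # A high-level macro expands to low-level macros, which are
            # then expanded again for the chosen platform.
            out.extend(expand(template.format(arg), platform))
        return out
    if name in LOW_LEVEL[platform]:
        return [LOW_LEVEL[platform][name].format(arg)]
    return [line]  # ordinary statement, passed through unchanged


source = ["CRITICAL(lck)", "x = x + 1", "END_CRITICAL(lck)"]
for platform in ("platform_a", "platform_b"):
    result = [out for line in source for out in expand(line, platform)]
    print(platform, result)
```

In this sketch, porting to a new platform means adding one entry to LOW_LEVEL; the HIGH_LEVEL table and the source program are untouched.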
Functional decomposition is preferred in cases where multiple parallel
program segments can be executed in parallel with respect to each other, but no
single segment has enough parallelism to occupy the complete force of processes. This
decomposition is especially desirable when a parallel segment is largely
sequential and, if executed by the entire force of processes, would leave processes
idle. The Resolve construct, which provides this functional decomposition in the Force, has been presented in several earlier papers [1][2],
but has not been completely integrated into any previous Force implementations. The
Silicon Graphics implementation of the Force language marks the first released
version of the Force language to include the Resolve construct.
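The idea of dividing one force of processes among several parallel segments, each with its own parallel environment, can be sketched as follows. This is a minimal illustration and not the Resolve implementation: the function name split_force, the explicit per-segment sizes, and the dictionary fields are assumptions introduced here, and how Resolve actually assigns processes to segments is not described in this chapter.

```python
# Hypothetical sketch of functional decomposition: a fixed force of processes
# is divided among segments, and each process receives a segment-local
# environment (its own local id and local process count).


def split_force(nproc, sizes):
    """Map each global process id to its segment and local environment.

    sizes[i] is the number of processes given to segment i (an assumption
    of this sketch); the sizes must add up to the whole force.
    """
    if sum(sizes) != nproc:
        raise ValueError("segment sizes must add up to the size of the force")
    env = {}
    first = 0  # global id of the first process in the current segment
    for segment, size in enumerate(sizes):
        for local_id in range(size):
            env[first + local_id] = {
                "segment": segment,
                "local_id": local_id,
                "local_nproc": size,
            }
        first += size
    return env


# Six processes: four work on segment 0 while two work on segment 1.
env = split_force(6, [4, 2])
for global_id in sorted(env):
    print(global_id, env[global_id])
```

No process is created or released by the split; the same six processes exist before and after, which is consistent with the global parallelism model described above.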
This paper begins with a review of the Force language in Chapter 2, including
the key concepts of the language and an overview of the implementation structure.
Chapter 3 presents a brief overview of the Silicon Graphics Power Challenge and
Origin2000 multiprocessors, a description of the steps needed to port the language to
a new platform, and points out the major issues that had to be resolved in the Silicon
Graphics implementation. Chapter 4 further discusses the benefits of allowing the
parallelism within a Force program to be divided along function, and explains how
the Resolve construct has been implemented in the Silicon Graphics' Force
implementation.


CHAPTER 2
THE FORCE PARALLEL PROGRAMMING
LANGUAGE
The Force is a parallel programming language based on the shared memory
multiprocessor model of computation. In this model a single program is being
executed by multiple processes, with each process having its own program counter.
This methodology is known as global parallelism or single program, multiple data (SPMD).
Force programs are written for an arbitrary number of identical processes, with the
number of processes being determined at run time.
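A minimal sketch of this execution model is given below in Python. Threads stand in for the Force processes, and the names run_force, make_program, me, and nproc are assumptions of the sketch rather than Force identifiers. Every worker runs the same program from start to finish, the number of workers is chosen at run time and never changes, and each worker uses only its own identifier to select its share of the work.

```python
import threading


def run_force(nproc, program):
    """Start nproc workers that all execute the same program, then wait."""
    threads = [
        threading.Thread(target=program, args=(me, nproc)) for me in range(nproc)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()


def make_program(data, partial):
    def program(me, nproc):
        # Identical code for every worker; only the process id differs.
        # Worker `me` sums elements me, me + nproc, me + 2*nproc, ...
        partial[me] = sum(data[me::nproc])

    return program


data = list(range(1, 101))
nproc = 4  # chosen at run time; the program is written for any value
partial = [0] * nproc
run_force(nproc, make_program(data, partial))
total = sum(partial)
print(total)
```

Because the program is written for an arbitrary number of identical workers, changing nproc changes only how the work is divided, not the result.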
The Force language is implemented as a two-layer macro preprocessor, and
relies on the constructs provided by the multiprocessor for process creation,
synchronization, and shared-memory allocation. This method of implementing the
language has the advantage of hiding machine dependencies from the user, and
makes Force programs portable across all platforms on which the Force language has
been implemented. The programmer is insulated from process management, and is
left free to concentrate on the synchronization issues of parallel programming.
Force Language Concepts
The Force language is defined by a set of parallel programming constructs,
each of which embodies one of the language concepts. The language concepts can be
divided into four categories: program structure, declaration of variables, parallel
execution, and synchronization.


Program Structure
A Force program consists of both regular Fortran statements and
Force language constructs. It has a single main program from which
both standard Fortran subroutines and Force multiprocess subroutines can be called.
When a Force subroutine is called, all the processes of the Force jump to and
execute the parallel subroutine, so a parallel subroutine may contain any of
the parallel constructs of the Force language. Standard Fortran routines are executed
independently by each process, which limits them to sequential constructs.
The main program includes constructs for distributing the work across the
processes, and for synchronizing both program control and data.
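Force <name> of <# of procs> ident <process id>
<declarations>
[Externf <Force module name>]
End declarations
<Force program>
Join

Forcecall <name>([parameters])

Forcesub <name>([parameters]) of <# of procs> ident <process id>
<declarations>
[Externf <Force module name>]
End declarations
<subroutine body>
RETURN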
Figure 2.1 : Force Program Structure Constructs
The Force program structure constructs are shown in Figure 2.1. The Force
and Forcesub macros are used to define Force main programs and subroutines,
respectively. All Force modules have the parallel environment available to them,
including variables that contain the number of Force processes, synchronization
variables, and unique process identifiers. The Join macro terminates the parallel
main program, whereas a Force parallel subroutine is terminated by an ordinary Fortran
Return statement. Force subroutines are called using the Forcecall macro.
Forcecall differs from the regular Fortran Call only in that it automatically
passes the parallel environment for each process.
Declaration of Variables
In standard Fortran, variables are declared as being either local or common.
The scope of a local variable is restricted to the routine in which it is declared. A
common variable has its scope expanded to all routines that declare the common
region. In Force, local and common specify the scope of a variable relative to a
single process. To define scope across processes, Force requires variables to also be
declared as either private or shared. The private/shared classification is orthogonal
to the local/common classification. A private variable is restricted to a single
process and has a separate instantiation for each process of the Force. A shared
variable has only a single instantiation and is accessible by all processes of the
Force. Private and shared variables inherit from Fortran the storage class of common
among program modules or local to one module. Each variable class in Force supports
all the standard Fortran variable types.
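The difference between private and shared variables can be illustrated with a short Python sketch. Threads stand in for Force processes, and the names shared_total, private_count, and private_seen are assumptions of the sketch. It models only the private/shared axis; the local/common axis (scope across routines) is not represented.

```python
import threading

NPROC = 4

# Shared: a single instantiation, visible to every worker.
shared_total = [0]
shared_lock = threading.Lock()

# Used only so the private values can be inspected after the run.
private_seen = [None] * NPROC


def worker(me):
    # Private: each worker has its own separate instantiation of this variable.
    private_count = 0
    for _ in range(1000):
        private_count += 1
    # Updates to the shared variable must be synchronized.
    with shared_lock:
        shared_total[0] += private_count
    private_seen[me] = private_count


threads = [threading.Thread(target=worker, args=(me,)) for me in range(NPROC)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(private_seen, shared_total[0])
```

Each worker counts to 1000 in its own private variable without interference, while the single shared total accumulates the contributions of all four workers.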
Private
Private Common /