Parallel programming model explained

In computing, a parallel programming model is an abstraction of parallel computer architecture, with which it is convenient to express algorithms and their composition in programs. The value of a programming model can be judged on its generality: how well a range of different problems can be expressed for a variety of different architectures, and its performance: how efficiently the compiled programs can execute.[1] The implementation of a parallel programming model can take the form of a library invoked from a programming language, as an extension to an existing languages.

Consensus around a particular programming model is important because it leads to different parallel computers being built with support for the model, thereby facilitating portability of software. In this sense, programming models are referred to as bridging between hardware and software.[2]

Classification of parallel programming models

Classifications of parallel programming models can be divided broadly into two areas: process interaction and problem decomposition.[3] [4] [5]

Process interaction

Process interaction relates to the mechanisms by which parallel processes are able to communicate with each other. The most common forms of interaction are shared memory and message passing, but interaction can also be implicit (invisible to the programmer).

Shared memory

See main article: Shared memory (interprocess communication). Shared memory is an efficient means of passing data between processes. In a shared-memory model, parallel processes share a global address space that they read and write to asynchronously. Asynchronous concurrent access can lead to race conditions, and mechanisms such as locks, semaphores and monitors can be used to avoid these. Conventional multi-core processors directly support shared memory, which many parallel programming languages and libraries, such as Cilk, OpenMP and Threading Building Blocks, are designed to exploit.

Message passing

See main article: Message passing. In a message-passing model, parallel processes exchange data through passing messages to one another. These communications can be asynchronous, where a message can be sent before the receiver is ready, or synchronous, where the receiver must be ready. The Communicating sequential processes (CSP) formalisation of message passing uses synchronous communication channels to connect processes, and led to important languages such as Occam, Limbo and Go. In contrast, the actor model uses asynchronous message passing and has been employed in the design of languages such as D, Scala and SALSA.

Partitioned global address space

See main article: Partitioned global address space. Partitioned Global Address Space (PGAS) models provide a middle ground between shared memory and message passing. PGAS provides a global memory address space abstraction that is logically partitioned, where a portion is local to each process. Parallel processes communicate by asynchronously performing operations (e.g. reads and writes) on the global address space, in a manner reminiscent of shared memory models. However by semantically partitioning the global address space into portions with affinity to a particular processes, they allow programmers to exploit locality of reference and enable efficient implementation on distributed memory parallel computers. PGAS is offered by many parallel programming languages and libraries, such as Fortran 2008, Chapel, UPC++, and SHMEM.

Implicit interaction

See main article: Implicit parallelism. In an implicit model, no process interaction is visible to the programmer and instead the compiler and/or runtime is responsible for performing it. Two examples of implicit parallelism are with domain-specific languages where the concurrency within high-level operations is prescribed, and with functional programming languages because the absence of side-effects allows non-dependent functions to be executed in parallel.[6] However, this kind of parallelism is difficult to manage[7] and functional languages such as Concurrent Haskell and Concurrent ML provide features to manage parallelism explicitly and correctly.

Problem decomposition

A parallel program is composed of simultaneously executing processes. Problem decomposition relates to the way in which the constituent processes are formulated.[8]

Task parallelism

See main article: Task parallelism. A task-parallel model focuses on processes, or threads of execution. These processes will often be behaviourally distinct, which emphasises the need for communication. Task parallelism is a natural way to express message-passing communication. In Flynn's taxonomy, task parallelism is usually classified as MIMD/MPMD or MISD.

Data parallelism

See main article: Data parallelism. A data-parallel model focuses on performing operations on a data set, typically a regularly structured array. A set of tasks will operate on this data, but independently on disjoint partitions. In Flynn's taxonomy, data parallelism is usually classified as MIMD/SPMD or SIMD.

Stream Parallelism

Stream parallelism, also known as pipeline parallelism, focuses on dividing a computationinto a sequence of stages, where each stage processes a portion of the inputdata. Each stage operates independently and concurrently, and the output of onestage serves as the input to the next stage. Stream parallelism is particularly suitablefor applications with continuous data streams or pipelined computations.

Implicit parallelism

See main article: Implicit parallelism. As with implicit process interaction, an implicit model of parallelism reveals nothing to the programmer as the compiler, the runtime or the hardware is responsible. For example, in compilers, automatic parallelization is the process of converting sequential code into parallel code, and in computer architecture, superscalar execution is a mechanism whereby instruction-level parallelism is exploited to perform operations in parallel.

Terminology

Parallel programming models are closely related to models of computation. A model of parallel computation is an abstraction used to analyze the cost of computational processes, but it does not necessarily need to be practical, in that it can be implemented efficiently in hardware and/or software. A programming model, in contrast, does specifically imply the practical considerations of hardware and software implementation.[9]

A parallel programming language may be based on one or a combination of programming models. For example, High Performance Fortran is based on shared-memory interactions and data-parallel problem decomposition, and Go provides mechanism for shared-memory and message-passing interaction.

Example parallel programming models

Name Class of interaction Class of decomposition Example implementations
Actor modelAsynchronous message passingTaskD, Erlang, Scala, SALSA
Bulk synchronous parallelShared memoryTaskApache Giraph, Apache Hama, BSPlib
Communicating sequential processesSynchronous message passingTaskAda, Occam, VerilogCSP, Go
CircuitsMessage passingTaskVerilog, VHDL
DataflowMessage passingTaskLustre, TensorFlow, Apache Flink
FunctionalMessage passingTaskConcurrent Haskell, Concurrent ML
LogP machineSynchronous message passingNot specifiedNone
Parallel random access machineShared memoryDataCilk, CUDA, OpenMP, Threading Building Blocks, XMTC
SPMD PGASPartitioned global address spaceDataFortran 2008, Unified Parallel C, UPC++, SHMEM
Global-view Task parallelismPartitioned global address spaceTaskChapel, X10

See also

Further reading

Notes and References

  1. Skillicorn, David B., "Models for practical parallel computation", International Journal of Parallel Programming, 20.2 133–158 (1991), https://www.ida.liu.se/~chrke55/papers/modelsurvey.pdf
  2. Leslie G. Valiant, "A bridging model for parallel computation", Communications of the ACM, Volume 33, Issue 8, August, 1990, pages 103–111.
  3. John E. Savage, Models of Computation: Exploring the Power of Computing, 2008, Chapter 7 (Parallel Computation), http://cs.brown.edu/~jes/book/
  4. Web site: 1.3 A Parallel Programming Model . 2024-03-21 . www.mcs.anl.gov.
  5. Web site: Introduction to Parallel Computing Tutorial HPC @ LLNL . 2024-03-21 . hpc.llnl.gov.
  6. Hammond, Kevin. Parallel functional programming: An introduction. In International Symposium on Parallel Symbolic Computation, p. 46. 1994.
  7. McBurney, D. L., and M. Ronan Sleep. "Transputer-based experiments with the ZAPP architecture." PARLE Parallel Architectures and Languages Europe. Springer Berlin Heidelberg, 1987.
  8. Web site: 2.2 Partitioning . 2024-03-21 . www.mcs.anl.gov.
  9. Skillicorn, David B., and Domenico Talia, Models and languages for parallel computation, ACM Computing Surveys, 30.2 123–169 (1998), https://www.cs.utexas.edu/users/browne/CS392Cf2000/papers/ModelsOfParallelComputation-Skillicorn.pdf