1.2. Overview of Multiprocessors and Message Passing
Now that your Raspberry Pi Self Organizing Cluster is set up, let’s discuss some important terms. A single system that uses two or more central processing units (CPUs) is considered a multiprocessor. A single Raspberry Pi, which contains one quad-core CPU, is not considered a multiprocessor, since it has only one CPU. However, a cluster like our Raspberry Pi cluster, in which a set of nodes is networked together so that they act as a unified system, is an example of a multiprocessor.
A distributed memory system has separate banks of memory, each local to a specific node or to a subset of cores on that node. Communication in a distributed memory system occurs through message passing, in which processes send messages to one another across the network. Recall that a process is an abstraction of a running program: each process has its own copy of the program code and its own (private) allocation of memory. Message passing can be used on a single multicore computer or on a cluster of computers. The illustration below is a schematic representation of a 3-node cluster (like your Raspberry Pi cluster). Note that each node has its own separate bank of memory, and that the nodes are connected to each other via an interconnect (in our case, a switch). Thus, our cluster is an example of a distributed memory system.
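The idea that each process has private memory and shares data only through explicit messages can be tried on a single multicore computer. The sketch below is not MPI: it uses Python's standard `multiprocessing` module as a stand-in, and the names `worker` and `run_demo` are hypothetical. It assumes a POSIX system (such as Raspberry Pi OS) where the "fork" start method is available.

```python
import multiprocessing as mp


def worker(conn, numbers):
    """Runs in a child process, which has its own private memory."""
    numbers.append(99)        # modifies only the child's copy of the list
    conn.send(sum(numbers))   # the result travels back as a message
    conn.close()


def run_demo():
    # Assumption: a POSIX system where the "fork" start method exists.
    ctx = mp.get_context("fork")
    numbers = [1, 2, 3]
    parent_conn, child_conn = ctx.Pipe()
    proc = ctx.Process(target=worker, args=(child_conn, numbers))
    proc.start()
    total = parent_conn.recv()   # blocks until the child's message arrives
    proc.join()
    return numbers, total


if __name__ == "__main__":
    numbers, total = run_demo()
    print("parent's list:", numbers)
    print("total received from child:", total)
```

The parent's list is still `[1, 2, 3]` after the child appends to its own copy, while the total of 105 reaches the parent only because the child sent it as a message.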
The Message Passing Interface (MPI) is a standardized library interface for passing messages between processes, and it is widely used for programming distributed memory systems. In the context of a cluster, the head node is responsible for dividing up a task and distributing it to a set of worker nodes (or workers), each of which performs some subset of the computations before sending its results back to the head. In this manner, each node can work on its set of computations independently and simultaneously. MPI provides a common interface that lets programmers write applications in which the processes on a system communicate in this manner.
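The head/worker pattern described above might look roughly like the following in mpi4py. This is a minimal sketch under stated assumptions, not the guide's own example: the helper `split_work`, the task (summing the squares of 1 through 100), and the file name used below are all hypothetical. Note also that with `scatter`, process 0 keeps one chunk for itself, so in this sketch the head does a share of the computation too.

```python
def split_work(items, n_parts):
    """Split items into n_parts nearly equal chunks (hypothetical helper)."""
    base, extra = divmod(len(items), n_parts)
    chunks, start = [], 0
    for i in range(n_parts):
        end = start + base + (1 if i < extra else 0)
        chunks.append(items[start:end])
        start = end
    return chunks


def main():
    # Imported here so that split_work can be tried without MPI installed.
    from mpi4py import MPI

    comm = MPI.COMM_WORLD
    rank = comm.Get_rank()   # this process's id; 0 plays the head
    size = comm.Get_size()   # total number of processes

    # The head divides the task into one chunk per process.
    chunks = split_work(list(range(1, 101)), size) if rank == 0 else None

    # Each process receives its own chunk from the head.
    my_chunk = comm.scatter(chunks, root=0)

    # Every process works on its chunk independently.
    partial = sum(x * x for x in my_chunk)

    # Partial results are sent back to the head.
    partials = comm.gather(partial, root=0)

    if rank == 0:
        print("sum of squares:", sum(partials))


if __name__ == "__main__":
    try:
        main()
    except ImportError:
        print("mpi4py is not installed; only split_work can be used here")
```

Saved under a hypothetical name such as `head_worker.py`, the sketch would typically be launched with something like `mpiexec -n 4 python head_worker.py`. The exact launcher and any host configuration depend on how MPI is set up on your cluster.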
1.2.1. mpi4py
The examples in this guide use a package called mpi4py. The mpi4py package provides Python functions that map directly to the functions of the underlying MPI C library. Students who learned programming in Python may find mpi4py an easier way to grasp message passing fundamentals than the standard C or Fortran implementations of MPI.
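To make the mapping concrete, the sketch below annotates a few mpi4py calls with the MPI C function each one corresponds to. The helper `format_greeting` and the message contents are hypothetical. One caveat: the lowercase `send` and `recv` methods pickle arbitrary Python objects before handing them to MPI, while the uppercase `Send` and `Recv` methods work on memory buffers and sit closer to the C calls.

```python
def format_greeting(rank, size, host):
    """Build the line each process prints (hypothetical helper)."""
    return f"Greetings from process {rank} of {size} on {host}"


def main():
    # Imported here so that format_greeting can be tried without MPI installed.
    from mpi4py import MPI

    comm = MPI.COMM_WORLD             # C: MPI_COMM_WORLD
    rank = comm.Get_rank()            # C: MPI_Comm_rank
    size = comm.Get_size()            # C: MPI_Comm_size
    host = MPI.Get_processor_name()   # C: MPI_Get_processor_name
    print(format_greeting(rank, size, host))

    # A single point-to-point message, sent only if a second process exists.
    if size >= 2:
        if rank == 0:
            comm.send({"note": "hello"}, dest=1, tag=0)   # C: MPI_Send
        elif rank == 1:
            msg = comm.recv(source=0, tag=0)              # C: MPI_Recv
            print("process 1 received", msg)


if __name__ == "__main__":
    try:
        main()
    except ImportError:
        print("mpi4py is not installed; only format_greeting can be used here")
```

There is no explicit call matching `MPI_Init` or `MPI_Finalize` here, because mpi4py by default initializes MPI when the `MPI` module is imported and finalizes it when the Python process exits.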
1.2.2. Check your knowledge
Q-1: How well do you understand common terms?
- cluster: a set of nodes networked in a way that it appears as a unified system
- distributed memory system: a computer system with localized banks of memory
- message passing: a communication technique where pairs of processes pass messages to each other
- multiprocessor: a computer system with two or more CPUs
- MPI: a standard API for passing messages between processes
1.2.3. Further Reading
A more detailed discussion of message passing can be found in Chapter 15.2 of Dive into Systems, or Chapter 2 of PDC for Beginners.