4.2. Message Passing Patterns: avoiding deadlock
Let’s look at a few more correct message passing examples.
4.2.1. Fix the deadlock
To fix the deadlock in the previous example, we coordinate the communication between pairs of processes so that their sends and receives happen in a fixed, matching order.
Note
The new code corrects the deadlock with a simple change: odd-numbered processes send first, and even-numbered processes receive first. This is a standard pattern for exchanging data between pairs of processes.
from mpi4py import MPI

# function to return True if the number of a process is odd, False if it is even
def odd(number):
    if (number % 2) == 0:
        return False
    else:
        return True

def main():
    comm = MPI.COMM_WORLD
    id = comm.Get_rank()                   #number of the process running the code
    numProcesses = comm.Get_size()         #total number of processes running
    myHostName = MPI.Get_processor_name()  #machine name running the code

    if numProcesses > 1 and not odd(numProcesses):
        sendValue = id
        if odd(id):
            #odd processes send to their paired 'neighbor', then receive from it
            comm.send(sendValue, dest=id-1)
            receivedValue = comm.recv(source=id-1)
        else:
            #even processes receive from their paired 'neighbor', then send to it
            receivedValue = comm.recv(source=id+1)
            comm.send(sendValue, dest=id+1)

        print("Process {} of {} on {} computed {} and received {}"
              .format(id, numProcesses, myHostName, sendValue, receivedValue))
    else:
        if id == 0:
            print("Please run this program with the number of processes "
                  "positive and even")

########## Run the main function
main()
Program file: 05messagePassing.py
Example usage:
python run.py ./05messagePassing.py N
Here N is the number of MPI processes to start.
run.py executes this program with mpirun using the number of processes given.
Exercise:
Run, using N = 4, 6, 8, and 10 processes. (Note what happens if you use an odd number.)
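The pairing can also be worked out without MPI. The following is a minimal sketch in plain Python. The helper names `partner` and `pairing_plan` are hypothetical and are not part of the program file. It lists, for each process, its paired neighbor and the order of its two operations, and it marks the process that is left without a neighbor when the number of processes is odd.

```python
def partner(rank):
    """Return the paired neighbor: odd ranks pair down, even ranks pair up."""
    return rank - 1 if rank % 2 == 1 else rank + 1


def pairing_plan(num_processes):
    """Map each rank to (partner, first operation, second operation)."""
    plan = {}
    for rank in range(num_processes):
        other = partner(rank)
        if other >= num_processes:
            # Only the last rank of an odd-sized run ends up here:
            # its would-be partner does not exist.
            plan[rank] = (None, None, None)
        elif rank % 2 == 1:
            plan[rank] = (other, "send", "recv")
        else:
            plan[rank] = (other, "recv", "send")
    return plan


if __name__ == "__main__":
    for rank, (other, first, second) in pairing_plan(4).items():
        print(rank, other, first, second)
```

In every pair, one side's first operation is a send and the other side's first operation is a receive, so neither process is left waiting on the other. This is the reason the example program checks for an even number of processes before communicating.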
4.2.2. Sending data structures
This next example illustrates that we can exchange different lists of data between processes.
Program file: 06messagePassing2.py
Example usage:
python run.py ./06messagePassing2.py N
Here N is the number of MPI processes to start.
run.py executes this program with mpirun using the number of processes given.
Exercise:
Run, using N = 4, 6, 8, and 10 processes.
4.2.2.1. Explore the code
In the following code, locate where the list of elements to be sent is being made by each process.
from mpi4py import MPI

# function to return True if the number of a process is odd, False if it is even
def odd(number):
    if (number % 2) == 0:
        return False
    else:
        return True

def main():
    comm = MPI.COMM_WORLD
    id = comm.Get_rank()                   #number of the process running the code
    numProcesses = comm.Get_size()         #total number of processes running
    myHostName = MPI.Get_processor_name()  #machine name running the code

    if numProcesses > 1 and not odd(numProcesses):
        #generate a list of 8 numbers, beginning with my id
        sendList = list(range(id, id+8))
        if odd(id):
            #odd processes send to their 'left neighbor', then receive from it
            comm.send(sendList, dest=id-1)
            receivedList = comm.recv(source=id-1)
        else:
            #even processes receive from their 'right neighbor', then send to it
            receivedList = comm.recv(source=id+1)
            comm.send(sendList, dest=id+1)

        print("Process {} of {} on {} computed {} and received {}"
              .format(id, numProcesses, myHostName, sendList, receivedList))
    else:
        if id == 0:
            print("Please run this program with the number of processes "
                  "positive and even")

########## Run the main function
main()
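To check the output of a run against what each process should report, the exchange can be reproduced without MPI. This is a sketch in plain Python under the same pairing rule as the program above. The names `build_send_list` and `simulate_exchange` are hypothetical helpers introduced only for illustration.

```python
def build_send_list(rank, length=8):
    """Same construction as the example: consecutive integers starting at rank."""
    return list(range(rank, rank + length))


def simulate_exchange(num_processes):
    """Return {rank: (sent_list, received_list)} for an even process count."""
    if num_processes < 2 or num_processes % 2 != 0:
        raise ValueError("number of processes must be positive and even")
    result = {}
    for rank in range(num_processes):
        # Odd ranks exchange with the rank below, even ranks with the rank above.
        neighbor = rank - 1 if rank % 2 == 1 else rank + 1
        # What a rank receives is exactly the list its neighbor built.
        result[rank] = (build_send_list(rank), build_send_list(neighbor))
    return result


if __name__ == "__main__":
    for rank, (sent, received) in simulate_exchange(4).items():
        print(rank, sent, received)
```

Each process should print its own list as the computed value and its neighbor's list as the received value, so the two lists in a pair differ by one in every position.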
4.2.3. Ring of passed messages
Another pattern that appears in message passing programs is a ring of processes, in which each process receives a message from the process before it and passes it on to the next one.
With 4 processes, for example, process 0 sends data to process 1. Process 1 receives it and sends it on to process 2, which receives it and sends it on to process 3. Process 3 receives it from process 2 and sends it back around to process 0.
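The trip around the ring can be traced without MPI. The sketch below is plain Python, and `ring_trace` is a hypothetical helper, not part of the program file. It assumes, as the example program in this section does, that process 0 starts with a list holding its own id and that every other process appends its id before passing the list on.

```python
def ring_trace(num_processes):
    """Return the (sender, receiver, message) hops of one trip around the ring."""
    if num_processes < 2:
        raise ValueError("a ring needs at least two processes")
    hops = []
    message = [0]  # process 0 starts the ring with its own id
    for sender in range(num_processes):
        # The modulo wraps the last process back around to process 0.
        receiver = (sender + 1) % num_processes
        hops.append((sender, receiver, list(message)))
        if receiver != 0:
            # Every process other than 0 appends its id before sending on.
            message = message + [receiver]
    return hops


if __name__ == "__main__":
    for sender, receiver, message in ring_trace(4):
        print("{} -> {}: {}".format(sender, receiver, message))
```

With 4 processes the last hop goes from process 3 to process 0 and carries the ids of all four processes.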
Program file: 07messagePassing3.py
Example usage:
python run.py ./07messagePassing3.py N
Here N is the number of MPI processes to start.
run.py executes this program with mpirun using the number of processes given.
Exercise:
Run using N = 1 through 8 processes. What happens when you run it with N = 1?
You should see a message asking for more than one process. At least two processes are needed to form the ring of passed messages!
4.2.3.1. Explore the code
Compare the results from running the example to the code below. Make sure that you can trace how the code generates the output that you see.
from mpi4py import MPI

def main():
    comm = MPI.COMM_WORLD
    id = comm.Get_rank()                   #number of the process running the code
    numProcesses = comm.Get_size()         #total number of processes running
    myHostName = MPI.Get_processor_name()  #machine name running the code

    if numProcesses > 1:
        if id == 0:  # conductor
            #generate a list with conductor id in it
            sendList = [id]
            # send to the first worker
            comm.send(sendList, dest=id+1)
            print("Conductor Process {} of {} on {} sent {}"
                  .format(id, numProcesses, myHostName, sendList))
            # receive from the last worker
            receivedList = comm.recv(source=numProcesses-1)
            print("Conductor Process {} of {} on {} received {}"
                  .format(id, numProcesses, myHostName, receivedList))
        else:
            # worker: receive from the previous process in the ring
            receivedList = comm.recv(source=id-1)
            # add this worker's id to the list and send along to next worker,
            # or send to the conductor if the last worker
            sendList = receivedList + [id]
            comm.send(sendList, dest=(id+1) % numProcesses)
            print("Worker Process {} of {} on {} received {} and sent {}"
                  .format(id, numProcesses, myHostName, receivedList, sendList))
    else:
        print("Please run this program with the number of processes "
              "greater than 1")

########## Run the main function
main()
Q-1: How is the ‘ring’ completed? That is, how does the last process determine that it should send back to process 0?
- The last process, which has the highest id, will have 0 as its destination because of the modulo (%) by the number of processes.
  - Correct! Note that you must code this yourself.
- The last process sends to process 0 by default.
  - Processes can send to any other process, including the highest numbered one.
- A destination cannot be higher than the highest process.
  - This is technically true, but it is important to see how the code ensures this.