9.1. Past Experience using clusters in a workshop
This chapter describes how we used a previous Raspberry Pi image for a remote workshop.
CSinParallel distributes a Raspberry Pi image that enables a group of Raspberry Pis to self-organize into a compute cluster. Participants of the 2021 CSinParallel Virtual Summer Workshop received Raspberry Pi Cluster Kits in the mail.
The video below gives a quick summary of what is mailed to participants:
The next two videos illustrate how to assemble the Cluster Kit and verify that it is operational. Please note that we assume that viewers have already set up the Raspberry Pi Kit “head node” that is used on day 1 of the workshop and have verified that they can connect to the head node.
9.1.1. A summary of important commands
After the initial setup, here is a summary of the commands you need to run each day to get your cluster up and running:
Starting up the cluster:
First, log in to the head node using VNC viewer or equivalent. In a terminal window:
Run the following command to configure the head node as the head node of the cluster:
sudo head-node
Log in to the hd-cluster account (notice the use of sudo before the su command):
sudo su - hd-cluster
Type pwd to ensure that you are the hd-cluster user (you should see output like /home/hd-cluster/):
pwd
Direct the head node to auto-discover all the worker nodes:
soc-mpisetup
The worker nodes that are discovered will be placed in a file called hostfile that is located in the home directory of the hd-cluster account.
Lastly, to test out your setup, run the following series of commands:
cd CSinParallel/Patternlets/MPI/00.spmd/
make
mpirun -hostfile ~/hostfile -np 4 ./spmd
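As a rough illustration of how the hostfile feeds into the test command above, the following Python sketch reads the text of a hostfile and assembles the same mpirun argument list. The hostfile layout (one node per line, with any count or extra fields after the node name) is an assumption, and the helper names read_hosts and build_mpirun_command are hypothetical; the file that soc-mpisetup actually writes may look different.

```python
import os


def read_hosts(hostfile_text):
    """Return the node names listed in the text of a hostfile.

    Assumed layout (not confirmed by this guide): one node per line,
    blank lines and '#' comments ignored, and anything after the node
    name (such as ':4' or 'slots=4') treated as extra detail.
    """
    hosts = []
    for line in hostfile_text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and padding
        if not line:
            continue
        first_field = line.split()[0]
        hosts.append(first_field.split(":", 1)[0])
    return hosts


def build_mpirun_command(hostfile_path, num_procs, program):
    """Build the argument list for the mpirun call shown in this section."""
    # Expand '~' here because no shell does it when a list is passed
    # to subprocess.run.
    path = os.path.expanduser(hostfile_path)
    return ["mpirun", "-hostfile", path, "-np", str(num_procs), program]


if __name__ == "__main__":
    sample = "192.168.0.2\n192.168.0.3\n"  # made-up addresses for illustration
    print(len(read_hosts(sample)), "nodes listed")
    print(" ".join(build_mpirun_command("~/hostfile", 4, "./spmd")))
```

The list returned by build_mpirun_command could be handed to subprocess.run from inside the 00.spmd directory, which would be equivalent to typing the mpirun line by hand.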
Note
PLEASE BE SURE TO SHUT DOWN YOUR CLUSTER USING THE COMMANDS BELOW. DO NOT JUST POWER OFF THE CLUSTER! If you power off the cluster without following the shutdown procedure outlined below, your cluster will likely enter an inconsistent state. The procedure below resets the cluster so that you can easily start it up again whenever you want.
Shutting down the cluster:
In the terminal window that you have open:
Type exit to leave the hd-cluster account:
exit
Type pwd to confirm that you are now the pi user (you should see output like /home/pi):
pwd
Shut down the worker nodes using the following command. Enter the default password if prompted (you should only need to do this the first time):
sudo shutdown-workers
Reset the head node to be a regular node:
sudo worker-node
Shut down the head node by using the following command:
sudo shutdown -h now
Following all of the above steps each time you start up and finish using the self-organizing Raspberry Pi cluster helps ensure that subsequent sessions start without errors.
9.1.2. Using this guide with your own cluster image
You can use this guide if you have your own Raspberry Pi cluster, or any system that has multiple cores and/or nodes. If you are planning on using your own cluster, make sure it has the following software installed:
MPICH or Open MPI library
(optional, for the more advanced examples) Python 3.6 or higher and the Python numpy package
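If you are unsure whether your own system meets these requirements, a quick check along the following lines may help. This is only a sketch: it assumes that an MPICH or Open MPI installation puts a launcher named mpirun or mpiexec on your PATH, and the function name check_requirements is hypothetical.

```python
import importlib.util
import shutil
import sys


def check_requirements():
    """Report whether the software listed above appears to be available."""
    return {
        # Python 3.6 or higher, as stated in the requirements.
        "python_ok": sys.version_info >= (3, 6),
        # numpy is only needed for the more advanced examples.
        "numpy_found": importlib.util.find_spec("numpy") is not None,
        # Assumption: an MPI install exposes mpirun or mpiexec on the PATH.
        # The value is the launcher's full path, or None if neither is found.
        "mpi_launcher": shutil.which("mpirun") or shutil.which("mpiexec"),
    }


if __name__ == "__main__":
    for name, value in check_requirements().items():
        print(f"{name}: {value}")
```

A missing launcher does not necessarily mean MPI is absent, since some systems install it under a different name or outside the default PATH.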
The Raspberry Pi cluster image provided by the CSinParallel group already has everything you need installed, including the code examples that we describe in detail in the following chapters.
9.1.3. What are Patterns?
Patterns in software are common implementations that have been used over and over by practitioners to accomplish tasks. As practitioners use them repeatedly, the community begins to give them names and catalog them, often turning them into reusable library functions. The examples you will see in this book are based on documented patterns that have been used to solve different problems using message passing between processes. Message passing is one form of distributed computing using processes, which can be used on clusters of computers or multicore machines.
In many of these examples, the pattern’s name is part of the Python code file’s name. You will also see that the MPI library functions often take on the name of the pattern, and the implementation of those functions itself contains the pattern that practitioners found themselves using repeatedly. The pattern code examples we show you here, dubbed patternlets, are based on original work by Joel Adams (4).
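To make the idea of message passing between processes concrete before the MPI examples begin, here is a small stand-in written with Python's standard multiprocessing module rather than MPI. It is not one of the patternlets: the names worker and run_spmd and the greeting text are made up for illustration, and the sketch assumes a Unix-like system such as Raspberry Pi OS where the "fork" start method is available. Every process runs the same function and differs only in its rank, and each one sends a message back to the parent.

```python
import multiprocessing as mp


def worker(rank, size, conn):
    """Code run by every process; only the value of `rank` differs."""
    conn.send(f"Greetings from process {rank} of {size}")
    conn.close()


def run_spmd(size):
    """Start `size` processes and collect one message from each."""
    ctx = mp.get_context("fork")  # assumption: Unix-like system
    receivers, procs = [], []
    for rank in range(size):
        # One-way pipe: the parent receives, the child sends.
        recv_end, send_end = ctx.Pipe(duplex=False)
        proc = ctx.Process(target=worker, args=(rank, size, send_end))
        proc.start()
        send_end.close()  # the parent keeps only the receiving end
        receivers.append(recv_end)
        procs.append(proc)
    # Receive in rank order so the result is deterministic.
    messages = [conn.recv() for conn in receivers]
    for proc in procs:
        proc.join()
    return messages


if __name__ == "__main__":
    for message in run_spmd(4):
        print(message)
```

With MPI, the launcher (mpirun) starts the processes and the library supplies each one with its rank, so the bookkeeping done by run_spmd here is handled for you.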
9.1.4. References
1. L. Dalcin, P. Kler, R. Paz, and A. Cosimo, Parallel Distributed Computing using Python, Advances in Water Resources, 34(9):1124-1139, 2011. http://dx.doi.org/10.1016/j.advwatres.2011.04.013
2. L. Dalcin, R. Paz, M. Storti, and J. D’Elia, MPI for Python: performance improvements and MPI-2 extensions, Journal of Parallel and Distributed Computing, 68(5):655-662, 2008. http://dx.doi.org/10.1016/j.jpdc.2007.09.005
3. L. Dalcin, R. Paz, and M. Storti, MPI for Python, Journal of Parallel and Distributed Computing, 65(9):1108-1115, 2005. http://dx.doi.org/10.1016/j.jpdc.2005.03.010
4. J. C. Adams, Patternlets: A Teaching Tool for Introducing Students to Parallel Design Patterns, 2015 IEEE International Parallel and Distributed Processing Symposium Workshop, IEEE, 2015.