1.1. Getting Started with your own Self-Organizing Cluster
CSinParallel distributes a Raspberry Pi image that enables a group of Raspberry Pis to self-organize into a compute cluster. The image contains the code examples for this module. You can download a compressed file of this image from here. It should fit on a microSD card of 8 GB or greater.
You will need to uncompress this image file and then use software such as balenaEtcher, on a laptop or desktop computer with a card slot or adapter, to write the image to the microSD card.
Note
The Appendix contains a series of older videos showing the setup we used during the Summer 2021 workshop. Please note that these videos are outdated. If you want an example of how to assemble a cluster, we recommend that you visit the following blog post.
1.1.1. A summary of important commands
After the initial setup, these are the commands you need to run each time you want to get your cluster up and running:
Starting up the cluster:
First, log in to the head node using VNC viewer or equivalent (password is the default Pi password). In a terminal window type:
ssh pi@172.27.0.254
Run the following command to configure this node as the head node of the cluster:
sudo head-node
Log into the hd-cluster account (notice the use of sudo before the su command):
sudo su - hd-cluster
Type pwd to ensure that you are the hd-cluster user (you should see output like /home/hd-cluster/):
pwd
Direct the head node to auto-discover all the worker nodes:
soc-mpisetup
The worker nodes that are discovered will be placed in a hostfile called hostfile, located in the top-level (home) directory of the hd-cluster account.
Lastly, to test out your setup, run the following series of commands:
cd CSinParallel/Patternlets/mpi4py
python run.py ./00spmd.py 4
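To confirm which worker nodes were discovered, you could inspect the hostfile written in the auto-discovery step above. The following is a minimal sketch, not part of the cluster image. It assumes the hostfile is a plain text file named hostfile in the hd-cluster home directory with one host entry per line, and that blank lines and text after a # can be ignored. The function name read_hostfile is hypothetical.

```python
from pathlib import Path


def read_hostfile(path):
    """Return the host entries in a hostfile, assuming one entry per line."""
    hosts = []
    for line in Path(path).read_text().splitlines():
        # Assumption: anything after '#' is a comment; blank lines are skipped.
        entry = line.split("#", 1)[0].strip()
        if entry:
            hosts.append(entry)
    return hosts


if __name__ == "__main__":
    # Assumed location: a file named "hostfile" in the current user's home directory.
    hostfile = Path.home() / "hostfile"
    if hostfile.exists():
        hosts = read_hostfile(hostfile)
        print("{} node(s) listed in {}:".format(len(hosts), hostfile))
        for host in hosts:
            print("  " + host)
    else:
        print("No hostfile found at {}".format(hostfile))
```

If the number of entries is lower than the number of Pis you powered on, that may be a hint that a node was not discovered.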
Note
PLEASE BE SURE TO SHUT DOWN YOUR CLUSTER USING THE COMMANDS BELOW. DO NOT JUST POWER OFF THE CLUSTER! If you power off the cluster without following the shutdown procedure outlined below, your cluster will likely enter an inconsistent state. The procedure below resets the cluster so that you can easily start it up again whenever you want.
Shutting down the cluster:
You will likely want to either start with a fresh copy of the code examples next time or make certain that the next person can start with a fresh copy. If you do not intend to come back and continue your work, then, while you are still the hd-cluster user, type in the terminal window:
reset-code
Then always carry out the following steps.
In the terminal window that you have open on the head node as the hd-cluster user:
Type exit to exit out of the hd-cluster account:
exit
Type pwd to confirm that you are now the pi user (you should see output like /home/pi):
pwd
Reset the head node to be a regular node:
sudo worker-node
Now you can power off each node.
Following the above steps fully whenever you start up and finish using the self-organizing Raspberry Pi cluster will help ensure that subsequent uses are error-free.
1.1.2. Using this guide with your own cluster image
You can use this guide if you have your own Raspberry Pi cluster, or any system that has multiple cores and/or nodes. If you are planning on using your own cluster, make sure you have the following software requirements:
MPICH or Open MPI library
(optional, for more advanced examples) Python NumPy package and Python 3.6 or higher
Get the code examples by cloning this repository: https://github.com/csinparallel/Pi-Files.git
The Raspberry Pi cluster image provided by the CSinParallel group already has everything you need installed, including the code examples that we describe in detail in the following chapters.
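If you are using your own cluster, a quick check of the requirements listed above might look like the following sketch. It is only an illustration under stated assumptions: it assumes an MPI installation exposes a launcher named mpiexec or mpirun on the PATH, and it also checks for the mpi4py package, which is not in the list above but is assumed here because the code examples live in a directory named mpi4py. The function name check_requirements is hypothetical.

```python
import importlib.util
import shutil
import sys


def check_requirements():
    """Return a dict mapping each requirement to True (found) or False."""
    return {
        "Python 3.6 or higher": sys.version_info >= (3, 6),
        # Assumption: MPICH and Open MPI both install mpiexec and/or mpirun.
        "MPI launcher (mpiexec or mpirun) on PATH": any(
            shutil.which(cmd) for cmd in ("mpiexec", "mpirun")
        ),
        # Assumption: the mpi4py examples need the mpi4py package.
        "mpi4py package": importlib.util.find_spec("mpi4py") is not None,
        "numpy package (optional)": importlib.util.find_spec("numpy") is not None,
    }


if __name__ == "__main__":
    for name, found in check_requirements().items():
        print("[{}] {}".format("ok" if found else "missing", name))
```

The check only looks for the presence of each item; it does not verify that the MPI library and the Python packages actually work together.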
1.1.3. What are Patterns?
Patterns in software are common implementations that have been used over and over by practitioners to accomplish tasks. As practitioners use them repeatedly, the community begins to give them names and catalog them, often turning them into reusable library functions. The examples you will see in this book are based on documented patterns that have been used to solve different problems using message passing between processes. Message passing is one form of distributed computing using processes, which can be used on clusters of computers or multicore machines.
In many of these examples, the pattern’s name is part of the Python code file’s name. You will also see that the MPI library functions often take on the name of the pattern, and that the implementation of those functions contains the pattern that practitioners found themselves using often. The pattern code examples we show you here, dubbed patternlets, are based on original work by Joel Adams (4).
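As an illustration of a pattern named in a file name, here is a minimal sketch of the single program, multiple data (SPMD) pattern written with mpi4py. It is not the 00spmd.py file from the image, whose exact contents are not reproduced here. The function name greeting and the message text are hypothetical; the mpi4py calls used (MPI.COMM_WORLD, Get_rank, Get_size, MPI.Get_processor_name) are part of mpi4py.

```python
def greeting(rank, size, host):
    """Build the message printed by one process; every process runs this same code."""
    return "Greetings from process {} of {} on {}".format(rank, size, host)


def main():
    try:
        from mpi4py import MPI  # needs mpi4py plus an MPI library such as MPICH
    except ImportError:
        print("mpi4py is not installed; cannot run the MPI part of this sketch")
        return
    comm = MPI.COMM_WORLD            # communicator containing all started processes
    rank = comm.Get_rank()           # id of this process, from 0 to size - 1
    size = comm.Get_size()           # total number of processes
    host = MPI.Get_processor_name()  # name of the node this process runs on
    print(greeting(rank, size, host))


if __name__ == "__main__":
    main()
```

With a typical MPI installation, a file like this (saved under a hypothetical name such as spmd_sketch.py) could be started with something like mpirun -np 4 python spmd_sketch.py. Each of the four processes runs the same program but reports a different rank, which is the essence of SPMD.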
1.1.4. What are Exemplars?
Exemplars are code examples that we and others have found to be accessible to people learning parallel and distributed computing (PDC) programming. In this book we describe two exemplars and how to run them after we take you through the patternlets. There are more on the Raspberry Pi image that you could also explore on your own.
1.1.5. References
- (1) L. Dalcin, P. Kler, R. Paz, and A. Cosimo, Parallel Distributed Computing using Python, Advances in Water Resources, 34(9):1124-1139, 2011. http://dx.doi.org/10.1016/j.advwatres.2011.04.013
- (2) L. Dalcin, R. Paz, M. Storti, and J. D’Elia, MPI for Python: performance improvements and MPI-2 extensions, Journal of Parallel and Distributed Computing, 68(5):655-662, 2008. http://dx.doi.org/10.1016/j.jpdc.2007.09.005
- (3) L. Dalcin, R. Paz, and M. Storti, MPI for Python, Journal of Parallel and Distributed Computing, 65(9):1108-1115, 2005. http://dx.doi.org/10.1016/j.jpdc.2005.03.010
- (4) J. C. Adams, Patternlets: A Teaching Tool for Introducing Students to Parallel Design Patterns, 2015 IEEE International Parallel and Distributed Processing Symposium Workshop, IEEE, 2015.