So you want to build a supercomputer? Well, if you checked out my video on quantum computing, you’ll know that scientists are working to make computers stronger and faster than ever. Unfortunately, building a quantum computer is not something you can do at home, unless your home is a laboratory of course. All hope isn’t lost though: you can still build your very own supercomputer. To do this, we will be making a Linux cluster. A cluster is a set of loosely or tightly connected computers that work together so that they can be viewed as a single system.
For this Linux cluster, we will be using a program called MPICH. MPICH is an implementation of the Message Passing Interface (MPI) standard, which is widely used in parallel computing to pass information between worker nodes on the network. One thing to note for this tutorial: I will not be making use of a shared file system. However, using a shared file system will make your life a lot easier. Without one (as I will show below), the file paths and usernames need to be the same on all systems. Now, let’s take a look at how to build a supercomputer.
What You Will Need:
- At least two devices running Linux
- MPICH
I am going to assume that you are running a Debian-based distro.
Step 1: Download Prerequisites
Before we get started, make sure that you have GCC, G++, OpenSSH and a Fortran compiler installed. Type in the following:
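On a Debian-based system, the install would look roughly like this. The package names (`gcc`, `g++`, `gfortran`, `openssh-server`) are the usual Debian/Ubuntu ones, so treat them as an assumption and adjust them if your distro differs.

```shell
# Refresh the package index first.
sudo apt-get update
# Install the C, C++ and Fortran compilers plus the OpenSSH server.
sudo apt-get install -y gcc g++ gfortran openssh-server
```

Run the same commands on every computer that will join the cluster. The build in Step 4 also needs `make`; installing the `build-essential` package is one common way to get it.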
Step 2: Create New User
Next, you need to create a new user and add them to the sudo group. Keep in mind, the username must be exactly the same on both systems (this is where using a shared file system would save some time).
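A minimal sketch of the two commands, using the standard Debian `adduser` and `usermod` tools. The username is written in lowercase (`dave`) because Debian's `adduser` usually rejects uppercase letters by default.

```shell
# Create the user (you will be prompted for a password and a few optional details).
sudo adduser dave
# Add the new user to the sudo group so it can run administrative commands.
sudo usermod -aG sudo dave
```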
This will create a new user called Dave and add him to the sudo group. Remember, repeat this process on all the other computers that you are planning to use in the final supercomputer (use the same username). Now log out and log back in as that user.
Step 3: Edit Hosts File
To make life a little easier, we are going to give names to all the computers on the network instead of just referring to them by their IP addresses. To do this, you must edit the hosts file. I am first going to install the nano text editor.
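Something along these lines should work. `/etc/hosts` is the standard location of the hosts file on Linux. The IP addresses and the name `master` in the comments are placeholders for illustration, not values from a real network.

```shell
# Install nano, then open the hosts file with root privileges.
sudo apt-get install -y nano
sudo nano /etc/hosts

# Inside the editor, add one line per computer: IP address first, then the name.
# Placeholder values for illustration only:
#   192.168.1.10   master
#   192.168.1.11   worker0
```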
Make sure that your hosts file looks similar to mine. Of course, replace the IP addresses (and names if you desire) with the appropriate values for your network. Hit Ctrl + O to save and Ctrl + X to exit.
To test it, type in the following:
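The usual tool for this kind of check is `ping`. A minimal sketch, where `-c 4` simply limits the test to four packets:

```shell
# Send four packets to the other computer using the name from the hosts file.
ping -c 4 worker0
```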
Of course, replace worker0 with whatever you called the other computer. It should be able to communicate with the other computer without failure.
Step 4: Configure MPICH Program
Download the mpich.tar.gz file for your system. Unzip it using whatever method you like (I prefer just to use the GUI).
Make a directory in your home folder called mpich.
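From the terminal this is a one-liner. The `-p` flag is an optional safety net so that the command does not fail if the folder already exists.

```shell
# Create the mpich folder in your home directory.
mkdir -p "$HOME/mpich"
```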
Then navigate to the MPICH folder that you just extracted and run the configure script.
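A sketch of the configure step, assuming you want the install to end up in the `mpich` folder in your home directory. `--prefix` is the standard configure option for choosing the install location.

```shell
# Run this from inside the extracted MPICH source folder (the one that contains "configure").
./configure --prefix="$HOME/mpich"
```

If configure stops with an error about a missing compiler, revisit Step 1.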
This will configure the files and put them in the mpich folder that you made in your home directory. Now, make the program:
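Assuming the standard build flow for software configured this way, that comes down to two commands run from the same source folder. No `sudo` should be needed, since the install location is inside your home directory.

```shell
# Compile MPICH (this can take a while).
make
# Copy the compiled programs and libraries into the folder chosen during configure.
make install
```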
Finally, copy the examples folder (located inside the folder that you extracted from mpich.your.version.tar.gz) into the mpich folder inside of your home directory. I am going to use the GUI (file explorer) rather than doing it from the terminal.
**Remember, all of these steps must be repeated on all systems on the network**
Step 5: Bashrc
Next, you need to export the paths in your .bashrc file. In your home directory, type in the following:
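A minimal sketch, assuming you are still using the nano editor from Step 3 and that your login shell is bash:

```shell
# Open the bash startup file for your user.
nano ~/.bashrc
```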
At the bottom of the file, add in the following lines:
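The exact lines depend on where MPICH was installed. Assuming the `mpich` folder in your home directory from Step 4, with programs in `bin` and libraries in `lib`, they would look something like this:

```shell
# Make mpicc, mpiexec and the other MPICH programs available on the command line.
export PATH="$HOME/mpich/bin:$PATH"
# Let programs find the MPICH libraries at run time (keeps any existing value).
export LD_LIBRARY_PATH="$HOME/mpich/lib${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}"
```

Once the file is saved, the new paths apply to every terminal you open from then on. Running `source ~/.bashrc` applies them to the current one.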
Then, type Ctrl + O to save and Ctrl + X to exit.
To test that it works type in:
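The `which` command prints the location of a program found on your PATH, so it is a natural fit for this check (an assumption about the exact command, but a safe one):

```shell
# Should print a path inside the mpich folder in your home directory.
which mpicc
```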
You should see the folder path. Do it again with mpiexec.
Step 6: Processor file
We need to have a file that specifies to MPICH the computers on the network and the number of processes that we want them to handle. Navigate to the MPICH directory in your home folder. Create a file (I’ll call it hosts) and within that file identify the computer name with the number of processes.
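A sketch of what that could look like, assuming MPICH's default Hydra process manager, which reads one `name:processes` entry per line. The name `master` and the process counts are placeholders; `worker0` is the name used in Step 3.

```shell
# Create the file inside the mpich folder in your home directory.
cd "$HOME/mpich"
nano hosts

# Contents of the file, one computer per line (name:number_of_processes).
# Placeholder values for illustration only:
#   master:4
#   worker0:4
```

A reasonable starting point for the process count is the number of CPU cores on each computer.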
Then, type Ctrl + O to save and Ctrl + X to exit.
Step 7: Password-less SSH
The final thing that we need to do is make sure that you can connect via ssh to the other computer(s) without needing a password. Let’s generate the ssh key.
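With OpenSSH installed, key generation is a single command. The default key type depends on your OpenSSH version, which is fine for this purpose.

```shell
# Generate a new SSH key pair for the current user (interactive prompts follow).
ssh-keygen
```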
Keep all the following values default and don’t specify a passphrase. Finally, type in:
Replace “worker0” with the appropriate computer on your network. This will copy the SSH key to that computer. It might prompt you to log in using the password the first time. In any case, try to ssh into that computer and make sure that you don’t need to enter a password.
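The standard OpenSSH helper for this is `ssh-copy-id`. A minimal sketch, assuming the same username exists on the other computer (as set up in Step 2):

```shell
# Install your public key on the other computer so it accepts key-based logins.
ssh-copy-id worker0
```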
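The check itself is just a plain SSH login, again using the example name `worker0`:

```shell
# Should log you in straight away, with no password prompt.
ssh worker0
```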
If it works, type “exit” to exit the ssh.
Step 8: Testing
Now it’s time for us to test the supercomputer. Just to summarize, the above steps were all so that the computers can communicate effectively with each other on the network using MPICH. Navigate to the examples folder inside of the mpich directory in your home folder and run the pi example program. Within the run command, you can specify the number of processes that you want to use:
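A sketch of the run command, assuming the Hydra `mpiexec` that ships with MPICH, the `hosts` file from Step 6 sitting in `~/mpich`, and the compiled pi example being named `cpi` (its name in the MPICH source tree). The process count of 8 is just an example.

```shell
cd "$HOME/mpich/examples"
# -f points at the host file, -n sets the total number of processes to launch.
mpiexec -f "$HOME/mpich/hosts" -n 8 ./cpi
```

Because there is no shared file system, the `cpi` program has to exist at the same path on every computer listed in the hosts file.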
If everything works, you should see a similar output.
As you can see, the work is split up between the individual computers on the network. Now keep in mind, this is just a simple program that calculates an approximate value of pi. Imagine if we developed a more complex program with many devices on the network. This would be extraordinary!