NOTE: If you find yourself angry after reading this post, read this first.

This seems difficult, at first glance, but really, it’s not.

At all.

Depending on your needs, getting from plugged-in hardware to massive parallel processing can take anywhere from 10 minutes to 2 hours. This simple guide will help you get there.

Get the Hardware

Now, mind you, I’m not trying to do this as cheaply as possible, but I am trying to do this with as much bang for your buck as possible. These are the things you need to get.

PCs: Duh, kinda the barebones necessity in a cluster, and I have a recommendation: eMachines T5212. It’s got a Pentium D 805 dual-core processor with each core running at 2.66 GHz (call it 5.32 GHz of combined clock per machine, though clock speeds don’t literally add up that way), and 2x1 MB of L2 cache, which while not stellar, is pretty respectable. It’s also got 1 GB of RAM and a 200 GB hard drive, so storage problems go away pretty quickly too. There’s a lower model with half the RAM and a smaller hard drive, the T5216, but I need the RAM, so I go with the T5212. At Best Buy and other stores, these run about $534.99 for just the tower. Mind you, I have chosen this box for the hardware’s compatibility with the software we’ll be using in a later step.
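To put the bang-for-your-buck math in one place, here is a rough sketch in Python that totals up what a stack of these towers gets you. The per-tower figures are the ones quoted above; the node count and the function name cluster_totals are just placeholders of mine, and it counts cores rather than adding clock speeds.

```python
# Per-tower figures quoted above for the eMachines T5212.
PRICE_CENTS = 53499   # $534.99, kept in cents to avoid float rounding
CORES = 2             # dual core, 2.66 GHz each
RAM_GB = 1
DISK_GB = 200


def cluster_totals(nodes):
    """Return aggregate cost and resources for `nodes` identical towers."""
    return {
        "nodes": nodes,
        "cores": nodes * CORES,
        "ram_gb": nodes * RAM_GB,
        "disk_gb": nodes * DISK_GB,
        "cost_cents": nodes * PRICE_CENTS,
    }


if __name__ == "__main__":
    # The node count here is a hypothetical example, not a recommendation.
    totals = cluster_totals(4)
    print("%(nodes)d towers: %(cores)d cores, %(ram_gb)d GB RAM, "
          "%(disk_gb)d GB disk" % totals)
    print("cost: $%.2f" % (totals["cost_cents"] / 100))
```

Cables and switches are not included, so treat the cost as a floor.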

Network Cables: You’re going to need at least one for each PC, and probably a couple more if you have an external device or PC acting as your DHCP server and/or gateway.

Network Hardware: You’re going to need a switch big enough for all your PCs to connect (or a series of small ones that you can daisy chain together). Life will also be a lot easier if you have ONE DHCP server for all of your machines. All the machines need to be on the same IP subnet, but they don’t need to be on the same network switch or in the same geographic area.
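If you want to sanity-check the "same IP subnet" requirement before blaming anything else, a small Python sketch like this will flag any node whose address falls outside the subnet your DHCP server hands out. The addresses and the 192.168.1.0/24 network below are made-up examples; substitute whatever your own network uses.

```python
import ipaddress


def stray_nodes(addresses, network):
    """Return the addresses that are NOT inside `network` (CIDR notation)."""
    net = ipaddress.ip_network(network)
    return [a for a in addresses if ipaddress.ip_address(a) not in net]


if __name__ == "__main__":
    # Hypothetical node addresses; the last one sits on a different subnet.
    nodes = ["192.168.1.10", "192.168.1.11", "192.168.2.5"]
    strays = stray_nodes(nodes, "192.168.1.0/24")
    if strays:
        print("not on the cluster subnet:", ", ".join(strays))
    else:
        print("all nodes share the subnet")
```

An empty result only tells you the addressing is consistent; it says nothing about cabling or whether the nodes can actually reach each other.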

Setting Up your Hardware:

In my personal configuration, I have a small network appliance that acts as a DHCP server, router, and print server, so I use that as the base of my networking needs. I then have a series of smaller switches which have 1 (count them, 1) link total back to the DHCP server. This is important for me, so that network traffic on the cluster doesn’t bog down the rest of my home network. How else could I play Halo while factoring 100-digit non-prime numbers? This will also help your cluster have fewer hops between nodes.

Your Software:

I strongly recommend the use of ClusterKnoppix. It’s a great tool, and is very stable. It uses the 2.4 Debian kernel, and has openMosix installed and configured for auto-detect (which means nodes are essentially plug-and-play, though not really, and I’ll discuss why later). You’ll need one copy for each box, unless you choose to commit the Knoppix image to the hard drive of each machine. It’s not necessary, but it may be easier if you don’t have a stack of CD-Rs at your disposal.

Booting up the Cluster:

This is probably the easiest part of the process. Place a ClusterKnoppix CD in each box, and boot it up. I can vouch that this hardware is compatible and you won’t have any issues loading, so now you’re ready to work! NOTE: If you do this with other hardware, I can’t guarantee things are going to work so swimmingly, and I am nowhere near qualified to help you troubleshoot your hardware. If you have a DHCP server and DNS somewhere on your network, your cluster should be live on the internet, so you can pick up your code off other boxes on the network or from a CVS server somewhere out there in the intarwebs. There is a version of GCC and G++, though I can’t think of the version numbers off the top of my head (feel free to check the link on the side of this blog, I’m sure it’s there somewhere).
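Since I can’t quote the compiler versions from memory, the simplest thing is to ask the box itself. Here is a minimal Python sketch that does so; it assumes a Python 3 interpreter is available on the node (that is my assumption, not something the live CD promises), and compiler_version is just a helper name I made up. Running gcc --version by hand at the prompt tells you the same thing.

```python
import shutil
import subprocess


def compiler_version(name):
    """Return the first line of `<name> --version`, or None if not installed."""
    path = shutil.which(name)
    if path is None:
        return None
    result = subprocess.run([path, "--version"],
                            capture_output=True, text=True)
    lines = result.stdout.splitlines()
    return lines[0] if lines else None


if __name__ == "__main__":
    for tool in ("gcc", "g++"):
        version = compiler_version(tool)
        print(tool + ":", version if version else "not found on this node")
```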

Making your Mosix Cluster a Beowulf

There are two methods, but they essentially do the same thing. The first is to commit a Knoppix image to the hard drive of one of your boxes. There are tools for formatting and repartitioning the hard drive in the utilities menu in KDE (I think it’s QTParted that’s installed, as well as a few others). Follow the instructions from http://www.knoppix.org or from any other live-distribution site (they’re a little extensive, else I would include them here). The second is to commit an alteration image to your hard disk, and then boot Knoppix from this alteration image (a new feature in Knoppix that I’ve never used, so once again, check the intarwebs).

Whichever method you choose, I would strongly recommend you use LAM/MPI (or Open MPI if you so strongly desire the biggest and baddest). The nice thing about this setup is that you do not need to configure each machine with LAM, or configure your root node with machine lists of all the other nodes in the network. All you have to do is create multiple processes on the node that has MPI installed, and openMosix will balance them across the cluster. It’s truly beautiful. In order to run a process under MPI, follow these simple instructions (for LAM):

bash$: lamboot

<some output here>

bash$: mpirun -np <number of processes> <your executable name> <your arguments>

That’s it. You don’t even need to compile your binaries using the MPI compilers, assuming they don’t use the MPI libraries. If they do, use mpic++ or mpicc as you would g++ or gcc, respectively.
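To make the "just create multiple processes" idea concrete, here is a rough Python sketch that skips MPI entirely: it forks a handful of independent, CPU-bound workers, each trial-dividing its own slice of candidates for a number. Plain independent processes like these are the kind of load openMosix is meant to spread around, but nothing in the script controls migration, so whether a worker actually moves to another node is up to the cluster. The function names, the worker count, and the sample number are all placeholders of mine, and it assumes a Python 3.8+ interpreter on a Linux box.

```python
import math
import os


def smallest_factor_in_range(n, start, stop):
    """Return the smallest divisor of n in [start, stop), or 0 if none."""
    for d in range(max(2, start), stop):
        if n % d == 0:
            return d
    return 0


def parallel_smallest_factor(n, workers=4):
    """Fork `workers` processes, each scanning one slice of 2..isqrt(n)."""
    limit = math.isqrt(n) + 1
    step = (limit - 2) // workers + 1
    children = []
    for i in range(workers):
        start = 2 + i * step
        stop = min(start + step, limit)
        read_fd, write_fd = os.pipe()
        pid = os.fork()
        if pid == 0:
            # Child: do the CPU-bound work, report through the pipe, exit.
            status = 1
            try:
                os.close(read_fd)
                found = smallest_factor_in_range(n, start, stop)
                os.write(write_fd, str(found).encode())
                status = 0
            finally:
                os._exit(status)
        # Parent: keep only the read end so later children don't inherit it.
        os.close(write_fd)
        children.append((pid, read_fd))

    factors = []
    for pid, read_fd in children:
        data = os.read(read_fd, 64)
        os.close(read_fd)
        os.waitpid(pid, 0)
        found = int(data) if data else 0
        if found:
            factors.append(found)
    # None means no divisor was found, i.e. n is prime (or smaller than 4).
    return min(factors) if factors else None


if __name__ == "__main__":
    # Arbitrary sample number; swap in whatever you want to factor.
    print(parallel_smallest_factor(600851475143))
```

Each worker shares nothing with the others except a pipe back to the parent, which keeps them as independent as possible. Trial division is only there to burn CPU; it is nowhere near good enough for 100-digit numbers.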

I’d love to hear success stories, so please, leave comments!