It's now the 13th of June, and alas, I still don't have internet access at home. I could spend hours judging the people at Comcast, but instead, I'll simply state that I hate them and wish horribly itchy diseases on their children.

That aside, I'm still making progress. My main hangup has been the lack of similarity in the machines I'm using for my cluster. The problem, then, is that one machine will finish its work faster than the others and sit idle waiting for them to catch up, so that the processes can pass information to each other at the same stage of the computation. Solutions?

There are a couple. I guess I should start with a brief overview of clustering.

Beowulf Clustering:

Beowulf clustering is simple. You create a process on a machine, and it does lots of work, while communicating that work back to a root machine.

The Problem:

My plan is to give each machine (minus the root) large amounts of data to process in the same way. With machines with different processor speeds, memory sizes, and bus speeds, the same amount of work can take very different amounts of time. Since the machines will work in cycles, all machines must finish one cycle before the next can start. Thus, the slowest machine will hold back the group (something one is usually trying to avoid in parallel processing).
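To put rough numbers on that, here is a minimal sketch of one lock-step cycle. The node speeds and the amount of work are made up for illustration (they are not measurements from my machines), and `cycle_time` is just a hypothetical helper name.

```python
def cycle_time(work_units, speeds):
    """Return (cycle length, per-node finish times) for one lock-step cycle.

    Every node receives the same amount of work, and the cycle only ends
    when the slowest node has finished.
    """
    finish_times = [work_units / speed for speed in speeds]
    return max(finish_times), finish_times


# Hypothetical speeds, in units of work per second (assumed values).
speeds = [4.0, 2.0, 1.0]

total, per_node = cycle_time(100, speeds)

# Time each node spends idle, waiting for the slowest one.
idle = [total - t for t in per_node]

print("cycle length:", total)
print("idle time per node:", idle)
```

With these assumed speeds, the fastest node spends three quarters of every cycle doing nothing.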

Mosix Clustering:

A brilliant plan, though poorly implemented: Mosix clustering is what every programmer wants when parallel programming. Using the standard Unix system calls, a programmer can fork() a process in two, and the two processes will migrate to the machine that the OS determines has the least load or can accomplish the task in the least amount of time.
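For reference, this is roughly what the fork() pattern looks like, sketched here in Python on a Unix-like system. Nothing in it is Mosix-specific, and `run_in_child` is a hypothetical helper name. On a plain Linux box both processes stay on the same machine; the appeal of Mosix is that the same unmodified code could end up running on another node.

```python
import os


def run_in_child(task):
    """Fork the current process, run task() in the child, and wait for it.

    Returns the child's exit code: 0 if task() returned normally,
    1 if it raised an exception.
    """
    pid = os.fork()
    if pid == 0:
        # Child process: do the work, then exit immediately without
        # running the parent's cleanup handlers.
        code = 1
        try:
            task()
            code = 0
        finally:
            os._exit(code)
    # Parent process: block until the child has finished.
    _, status = os.waitpid(pid, 0)
    return os.WEXITSTATUS(status)


def crunch():
    # Stand-in for a real chunk of work.
    sum(i * i for i in range(100000))


exit_code = run_in_child(crunch)
print("child exited with", exit_code)
```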

How this solves my problem:

When a machine finishes its assigned work and is sitting idle, the operating system would then automatically send it work, decreasing the work left to be done on other nodes, and decreasing the amount of time necessary to finish all distributed work.
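The effect can be sketched with a small simulation. To be clear, this is not how Mosix works internally (it migrates whole processes), just a toy model of "idle machines pick up remaining work" compared with a fixed even split. The speeds and chunk sizes are assumed values, and both function names are hypothetical.

```python
import heapq


def static_makespan(chunks, speeds):
    """Chunks are dealt out round-robin up front; nobody helps anyone else."""
    busy = [0.0] * len(speeds)
    for i, chunk in enumerate(chunks):
        node = i % len(speeds)
        busy[node] += chunk / speeds[node]
    return max(busy)


def dynamic_makespan(chunks, speeds):
    """Whichever node becomes free first takes the next chunk."""
    # Heap of (time at which the node becomes free, node index).
    free_at = [(0.0, node) for node in range(len(speeds))]
    heapq.heapify(free_at)
    finish = 0.0
    for chunk in chunks:
        t, node = heapq.heappop(free_at)
        t += chunk / speeds[node]
        finish = max(finish, t)
        heapq.heappush(free_at, (t, node))
    return finish


# Hypothetical cluster: three nodes, twelve equal chunks of work.
speeds = [4.0, 2.0, 1.0]
chunks = [10.0] * 12

print("even split:   ", static_makespan(chunks, speeds))
print("idle nodes pull:", dynamic_makespan(chunks, speeds))
```

With these made-up numbers the even split takes 40 seconds and the pull-based version takes 20, because the fast node ends up doing most of the chunks instead of waiting around.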

Problems with Mosix:
Mosix has yet to be ported to the 2.6 Linux kernel. There's a beta version out there, but as far as I can tell, it's not gonna boot. This means a trip down memory lane to Fedora Core 1, a stable, but obviously older, OS.

Obviously, while Mosix clustering is the ideal solution to my problem, it's not very feasible.

Next time, more infeasible solutions, and possibly a feasible one.
