
(Part II, Chapters 5-7) High Performance Linux Clusters with OSCAR, Rocks, OpenMosix, and MPI


Part II: Getting Started Quickly

This section describes the installation of three software packages, each of which will provide you with a complete working cluster. These packages differ radically. openMosix provides Linux kernel extensions that transparently move processes among machines to balance loads and optimize performance. While a truly remarkable package, it is not what people typically think about when they hear the word "cluster." OSCAR and Rocks are collections of software packages that can be installed at once, providing a more traditional Beowulf-style cluster. Whichever way you decide to go, you will be up and running in short order.

Chapter 5. openMosix

openMosix is software that extends the Linux kernel so that processes can migrate transparently among the different machines within a cluster in order to more evenly distribute the workload. This chapter gives the basics of setting up and using an openMosix cluster. There is a lot more to openMosix than described here, but this should be enough to get you started and keep you running for a while unless you have some very special needs.

5.1 What Is openMosix?

Basically, the openMosix software includes both a set of kernel patches and support tools. The patches extend the kernel to provide support for moving processes among machines in the cluster. Typically, process migration is totally transparent to the user. However, by using the tools provided with openMosix, as well as third-party tools, you can control the migration of processes among machines.

Let's look at how openMosix might be used to speed up a set of computationally expensive tasks. Suppose, for example, you have a dozen files to compress using a CPU-intensive program on a machine that isn't part of an openMosix cluster. You could compress each file one at a time, waiting for one to finish before starting the next. Or you could run all the compressions simultaneously by starting each compression in a separate window or by running each compression in the background (ending each command line with an &). Of course, either way will take about the same amount of time and will load down your computer while the programs are running.
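
For example, on a single machine you might launch a dozen compressions in the background from the shell. This is only a sketch: bzip2 and the file names stand in for whatever CPU-intensive compressor and files you actually have.

for f in file01 file02 file03 file04 file05 file06 \
         file07 file08 file09 file10 file11 file12
do
    bzip2 "$f" &          # start each compression in the background
done
wait                      # return only after all twelve have finished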

However, if your computer is part of an openMosix cluster, here's what will happen: First, you will start all of the processes running on your computer. With an openMosix cluster, after a few seconds, processes will start to migrate from your heavily loaded computer to other idle or less loaded computers in the cluster. (As explained later, because some jobs may finish quickly, it can be counterproductive to migrate too quickly.) If you have a dozen idle machines in the cluster, each compression should run on a different machine. Your machine will have only one compression running on it (along with a little added overhead), so you may still be able to use it. And the dozen compressions will take only a little longer than it would normally take to do a single compression.

If you don't have a dozen computers, or some of your computers are slower than others, or some are otherwise loaded, openMosix will move the jobs around as best it can to balance the load. Once the cluster is set up, this is all done transparently by the system. Normally, you just start your jobs. openMosix does the rest. On the other hand, if you want to control the migration of jobs from one computer to the next, openMosix supplies you with the tools to do just that.

(Currently, openMosix also includes a distributed filesystem. However, this is slated for removal in future releases. The new goal is to integrate support for a clustering filesystem such as Intermezzo.)

5.2 How openMosix Works

openMosix originated as a fork from the earlier MOSIX (Multicomputer Operating System for Unix) project. The openMosix project began when the licensing structure for MOSIX moved away from a General Public License. Today, it has evolved into a project in its own right. The original MOSIX project is still quite active under the direction of Amnon Barak (http://www.mosix.org). openMosix is the work of Moshe Bar, originally a member of the MOSIX team, and a number of volunteers. This book focuses on openMosix, but MOSIX is a viable alternative that can be downloaded at no cost.

As noted in Chapter 1, one approach to sharing a computation between processors in a single-enclosure computer with multiple CPUs is symmetric multiprocessor (SMP) computing. openMosix has been described, accurately, as turning a cluster of computers into a virtual SMP machine, with each node providing a CPU. openMosix is potentially much cheaper and scales much better than SMPs, but communication overhead is higher. (openMosix will work with both single-processor systems and SMP systems.) openMosix is an example of what is sometimes called single system image clustering (SSI) since each node in the cluster has a copy of a single operating system kernel.

The granularity for openMosix is the process. Individual programs, as in the compression example, may create the processes, or the processes may be the result of different forks from a single program. However, if you have a computationally intensive task that does everything in a single process (even one that uses multiple threads), then, since there is only one process, it can't be shared among processors. The best you can hope for is that it will migrate to the fastest available machine in the cluster.
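
To make the granularity issue concrete, here is a minimal sketch, not taken from openMosix itself, of a program whose forked children are independent processes and can therefore migrate independently. A single-process (or purely threaded) version of the same computation could not be spread out this way.

#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

int main(void)
{
    int i;

    for (i = 0; i < 4; i++) {
        pid_t pid = fork();
        if (pid < 0) {
            perror("fork");
            exit(1);
        }
        if (pid == 0) {                /* child: CPU-bound busy work */
            volatile long sum = 0;
            long j;
            for (j = 0; j < 1000000000L; j++)
                sum += j;
            _exit(0);
        }
    }
    while (wait(NULL) > 0)             /* parent: collect all children */
        ;
    return 0;
}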

Not all processes migrate. For example, if a process lasts only a few seconds (very roughly, less than 5 seconds, depending on a number of factors), it will not have time to migrate. Currently, openMosix does not work with multiple processes using shared writable memory, such as web servers.[1] Similarly, processes doing direct manipulation of I/O devices won't migrate. And processes using real-time scheduling won't migrate. If a process has already migrated to another processor and attempts to do any of these things, the process will migrate back to its unique home node (UHN), the node where the process was initially created, before continuing.

[1] Actually, the migration of shared memory (MigSHM) patch is an openMosix patch that implements shared memory migration. At the time this was written, it was not part of the main openMosix tree. (Visit http://mcaserta.com/maask/.)

To support process migration, openMosix divides processes into two parts or contexts. The user context contains the program code, stack, data, etc., and is the part that can migrate. The system context, which contains a description of the resources the process is attached to and the kernel stack, does not migrate but remains on the UHN.

openMosix uses an adaptive resource allocation policy. That is, each node monitors and compares its own load with the loads on a portion of the other computers within the cluster. When a computer finds a more lightly loaded computer (based on the overall capacity of the computer), it will attempt to migrate a process to the more lightly loaded computer, thereby creating a more balanced load between the two. As the loads on individual computers change, e.g., when jobs start or finish, processes will migrate among the computers to rebalance loads across the cluster, adapting dynamically to the changes in loads.

Individual nodes, acting as autonomous systems, decide which processes migrate. The communication used to compare loads takes place among small, randomly chosen sets of nodes within the cluster. This random element is what allows clusters to scale well. Because communication is confined to subsets of the cluster, each node has limited but recent information about the state of the whole cluster, which reduces overhead and communication.

While load comparison and process migration are generally automatic within a cluster, openMosix provides tools to control migration. It is possible to alter the cluster's perception of how heavily an individual computer is loaded, to tie processes to a specific computer, or to block the migration of processes to a computer. However, precise control for the migration of a group of processes is not practical with openMosix at this time.[2]

[2] This issue is addressed by a patch that allows the creation of process groups, available at http://www.openmosixview.com/miggroup/.

The openMosix API uses the values in the flat files in /proc/hpc to record and control the state of the cluster. If you need information about the current configuration, want to do really low-level management, or write management scripts, you can look at or write to these files.
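
As a sketch of the sort of thing you can do, the commands below read a node's load and toggle the blocking of guest processes. The exact file names under /proc/hpc vary somewhat between openMosix releases, so check the documentation for your version.

cat /proc/hpc/nodes/1/load        # report the openMosix load on node 1
echo 1 > /proc/hpc/admin/block    # refuse arriving guest processes on this node
echo 0 > /proc/hpc/admin/block    # accept guest processes again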

5.3 Selecting an Installation Approach

Since openMosix is a kernel extension, it won't work with just any kernel. At this time, you are limited to a relatively recent (version 2.4.17 or later) IA32-compatible Linux kernel. An IA64 port is also available. However, don't expect openMosix to be available for a new kernel the same day the kernel is released. It takes time to develop patches for a kernel. Fortunately, your choice of Linux distributions is fairly broad. Among others, openMosix has been reported to work on Debian, Gentoo, Red Hat, and SuSE Linux. If you just want to play with it, you might consider Bootable Cluster CD (BCCD), Knoppix, or PlumpOS, three CD-bootable Linux distributions that include openMosix. You'll also need a reasonably fast network and a fair amount of swap space to run openMosix.

To build your openMosix cluster, you need to install an openMosix extended kernel on each of the nodes in the cluster. If you are using a suitable version of Linux and have no other special needs, you may be able to download a precompiled version of the kernel. This will significantly simplify setup. Otherwise, you'll need to obtain a clean copy of the kernel sources, apply the openMosix patches to the kernel source code, recompile the sources, and install the patched kernel. This isn't as difficult as it might sound, but it is certainly more involved than just installing a precompiled kernel. Recompiling the kernel is described in detail later in this chapter. We'll start with precompiled kernels.

While using a precompiled kernel is the easiest way to go, it has a few limitations. The documentation is a little weak with the precompiled kernels, so you won't know exactly what options have been compiled into the kernel without doing some digging. (However, the .config files are available via CVS and the options seem to be reasonable.) If you already have special needs that required recompiling your kernel, e.g., nonstandard hardware, don't expect those needs to go away.

You'll need to use the same version of the patched kernel on all your systems, so choose accordingly. This doesn't mean you must use the same kernel image. For example, you can use different compiles to support different hardware. But all your kernels should have the same version number.

The openMosix user tools should be downloaded when you download the openMosix kernel or kernel patches. You will also want to download and install openMosixView, a set of third-party tools for openMosix.

5.4 Installing a Precompiled Kernel

The basic steps for installing a precompiled kernel are selecting and downloading the appropriate files and packages, installing those packages, and making a few minor configuration changes.

5.4.1 Downloading

You'll find links to available packages at http://openmosix.sourceforge.net.[3] You'll need to select from among several versions and compilations. At the time this was written, there were half a dozen different kernel versions available. For each of these, there were eight possible downloads, including a README file, a kernel patch file, a source file that contains both a clean copy of the kernel and the patches, and five precompiled kernels for different processors. The precompiled versions are for an Intel 386 processor, an Intel 686 processor, an Athlon processor, Intel 686 SMP processors, or Athlon SMP processors. The Intel 386 is said to be the safest version. The Intel 686 version is for Intel Pentium II and later CPUs. With the exception of the text README file and a compressed (gz) set of patches, the files are in RPM format.

[3] And while you are at it, you should also download a copy of Kris Buytaert's openMosix HOWTO from http://www.tldp.org/HOWTO/openMosix-HOWTO/.

The example that follows uses the package openmosix-kernel-2.4.24-openmosix1.i686.rpm for a single-processor Pentium II system running Red Hat 9. Be sure you read the README file! While you are at it, you should also download a copy of the latest suitable version of the openMosix user tools from the same site. Again, you'll have a number of choices. You can download binaries in RPM or DEB format as well as the sources. For this example, the file openmosix-tools-0.3.5-1.i386.rpm was used.

Perhaps the easiest thing to do is to download everything at once and burn it to a CD so you'll have everything handy as you move from machine to machine. But you could use any of the techniques described in Chapter 8, or you could use the C3 tools described in Chapter 10. Whatever your preference, you'll need to get copies of these files on each machine in your cluster.

There is one last thing to do before you install: create an emergency boot disk if you don't have one. While it is unlikely that you'll run into any problems with openMosix, you are adding a new kernel.

Don't delete the old kernel. As long as you keep it and leave it in your boot configuration file, you should still be able to go back to it. If you do delete it, an emergency boot disk will be your only hope.

To create a boot disk, you use the mkbootdisk command as shown here:

[root@fanny root]# uname -r
2.4.20-6
[root@fanny root]# mkbootdisk \
> --device /dev/fd0 2.4.20-6
Insert a disk in /dev/fd0. Any information on the disk will be lost.
Press <Enter> to continue or ^C to abort:

(The last argument to mkbootdisk is the kernel version. If you can't remember this, use the command uname -r first to refresh your memory.)

5.4.2 Installing

Since we are working with RPM packages, installation is a breeze. Just change to the directory where you have the files and, as root, run rpm.

[root@fanny root]# rpm -vih openmosix-kernel-2.4.24-openmosix1.i686.rpm
Preparing...                ########################################### [100%]
1:openmosix-kernel       ########################################### [100%]
[root@fanny root]# rpm -vih openmosix-tools-0.3.5-1.i386.rpm
Preparing...                ########################################### [100%]
1:openmosix-tools        ########################################### [100%]
Edit /etc/openmosix.map if you don't want to use the autodiscovery daemon.

That's it! The kernel has been installed for you in the /boot directory.

This example uses the 2.4.24-om1 release. 2.4.24-om2 should be available by the time you read this. This newer release corrects several bugs and should be used.

You should also take care to use an openMosix tool set that is in sync with the kernel you are using, i.e., one that has been compiled with the same kernel header files. If you are compiling both, this shouldn't be a problem. Otherwise, you should consult the release notes for the tools.

5.4.3 Configuration Changes

While the installation will take care of the stuff that can be automated, there are a few changes you'll have to make manually to get openMosix running. These are very straightforward.

As currently installed, the next time you reboot your systems, your boot loader will give you the option of starting openMosix, but it won't be the default kernel. To boot the new openMosix kernel, you'll just need to select it from the menu. Unless you set openMosix as the default kernel, you'll need to select it manually every time you reboot a system.

If you want openMosix as the default kernel, you'll need to reconfigure your boot loader. For example, if you are using grub, you'll need to edit /etc/grub.conf to select the openMosix kernel. The installation will have added openMosix to this file, but will not have set it as the default kernel. You should see two sets of entries in this file. (You'll see more than two if you already have other additional kernels.) Change the variable default to select which kernel you want as the default. The variable is indexed from 0. If openMosix is the first entry in the file, change the line setting default so that it reads default=0.
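
For example, the relevant portion of /etc/grub.conf might look something like the following. The kernel and ram-disk file names here are hypothetical; use whatever the openMosix package actually installed in /boot.

default=0        # boot the first title entry below, the openMosix kernel
timeout=10
title openMosix (2.4.24-openmosix1)
        root (hd0,0)
        kernel /vmlinuz-2.4.24-openmosix1 ro root=LABEL=/
        initrd /initrd-2.4.24-openmosix1.img
title Red Hat Linux (2.4.20-6)
        root (hd0,0)
        kernel /vmlinuz-2.4.20-6 ro root=LABEL=/
        initrd /initrd-2.4.20-6.img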

If you are using LILO, the procedure is pretty much the same except that you will need to manually create the entry in the configuration file and rerun the loader. Edit the file /etc/lilo.conf. You can use a current entry as a template. Just copy the entry, edit it to use the new kernel, and give it a new label. Change default so that it matches your new label, e.g., default=openMosix. Save the file and run the command /sbin/lilo -v.
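
A corresponding /etc/lilo.conf entry might look something like this sketch (again, the kernel and ram-disk file names are hypothetical, and your root device will differ). Don't forget to rerun /sbin/lilo -v after saving the file.

default=openMosix

# hypothetical openMosix entry; adjust the file names and root device
image=/boot/vmlinuz-2.4.24-openmosix1
        label=openMosix
        initrd=/boot/initrd-2.4.24-openmosix1.img
        read-only
        root=/dev/hda1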

Another issue is whether your firewall will block openMosix traffic. The openMosix FAQ reports that openMosix uses UDP ports in the 5000-5700 range, UDP port 5428, and TCP ports 723 and 4660. (You can easily confirm this by monitoring network traffic, if in doubt.) You will also need to allow any other related traffic such as NFS or SSH traffic. Address this before you proceed with the configuration of openMosix.
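
If you are using iptables, a minimal sketch along the following lines would admit the reported ports from the cluster's own subnet. The interface and subnet shown are hypothetical; adjust them for your network. (UDP port 5428 already falls within the 5000-5700 range; it is shown separately only because the FAQ mentions it explicitly.)

iptables -A INPUT -i eth0 -s 10.0.32.0/24 -p udp --dport 5000:5700 -j ACCEPT
iptables -A INPUT -i eth0 -s 10.0.32.0/24 -p udp --dport 5428 -j ACCEPT
iptables -A INPUT -i eth0 -s 10.0.32.0/24 -p tcp --dport 723 -j ACCEPT
iptables -A INPUT -i eth0 -s 10.0.32.0/24 -p tcp --dport 4660 -j ACCEPT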

In general, security has not been a driving issue with the development of openMosix. Consequently, it is probably best to use openMosix in a restrictive environment. You should either locate your firewall between your openMosix cluster and all external networks, or you should completely eliminate the external connection.

openMosix needs to know about the other machines in your cluster. You can either use the autodiscovery tool omdiscd to dynamically create a map, or you can create a static map by editing the file /etc/openmosix.map (or /etc/mosix.map or /etc/hpc.map on earlier versions of openMosix). omdiscd can be run as a foreground command or as a daemon in the background. Routing must be properly configured for omdiscd to work correctly. For small, static clusters, it is probably easier to edit /etc/openmosix.map once and be done with it.
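
If you go the autodiscovery route, usage is simple. The -n flag, which keeps omdiscd in the foreground, is described in the openMosix HOWTO.

omdiscd            # run autodiscovery as a background daemon
omdiscd -n         # or run it in the foreground to watch what it is doing
showmap            # then verify the map it has built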

For a simple cluster, the map file can be very short. Its simplest form has one entry for each machine. In this format, each entry consists of three fields: a unique node number (starting at 1) for each machine, the machine's IP address, and a 1 indicating that it is a single machine. It is also possible to have a single entry for a range of machines that have contiguous IP addresses. In that case, the first two fields are the same: the node number for the first machine and the IP address of the first machine. The third field is the number of machines in the range. The address can be an IP number or a host name from your /etc/hosts file. For example, consider the following entry:

1       fanny.wofford.int       5

This says that fanny.wofford.int is the first of five nodes in a cluster. Since fanny's IP address is 10.0.32.144, the cluster consists of the following five machines: 10.0.32.144, 10.0.32.145, 10.0.32.146, 10.0.32.147, and 10.0.32.148. Their node numbers are 1 through 5. You could use separate entries for each machine. For example,

1       fanny.wofford.int       1
2       george.wofford.int      1
3       hector.wofford.int      1
4       ida.wofford.int         1
5       james.wofford.int       1

or, equivalently

1       10.0.32.144         1
2       10.0.32.145         1
3       10.0.32.146         1
4       10.0.32.147         1
5       10.0.32.148         1

Again, you can use the first of these two formats only if you have entries for each machine in /etc/hosts. If you have multiple blocks of noncontiguous machines, you will need an entry for each contiguous block. If you use host names, be sure you have an entry in your host table for your node that has its actual IP address, not just the local host address. That is, you need lines that look like

127.0.0.1       localhost
172.16.1.1      amy

not

127.0.0.1       localhost   amy

You can list the map that openMosix is using with the showmap command. (This is nice to know if you are using autodiscovery.)

[root@fanny etc]# showmap
My Node-Id: 0x0001
Base Node-Id Address          Count
------------ ---------------- -----
0x0001       10.0.32.144      1
0x0002       10.0.32.145      1
0x0003       10.0.32.146      1
0x0004       10.0.32.147      1
0x0005       10.0.32.148      1

Keep in mind that the format depends on the map file format. If you use the range format for your map file, you will see something like this instead:

[root@fanny etc]# showmap
My Node-Id: 0x0001
Base Node-Id Address          Count
------------ ---------------- -----
0x0001       10.0.32.144      5

While the difference is insignificant, it can be confusing if you aren't expecting it.

There is also a configuration file /etc/openmosix/openmosix.config. If you are using autodiscovery, you can edit this to start the discovery daemon whenever openMosix is started. This file is heavily commented, so it should be clear what you might need to change, if anything. It can be ignored for most small clusters using a map file.

Of course, you will need to duplicate this configuration on each node on your cluster. You'll also need to reboot each machine so that the openMosix kernel is loaded. As root, you can turn openMosix on or off as needed. When you install the user tools package, a script called openmosix is copied to /etc/init.d so that openMosix will be started automatically. (If you are manually compiling the tools, you'll need to copy this script over.) The script takes the arguments start, stop, status, restart, and reload, as you might have guessed. For example,

[root@james root]# /etc/init.d/openmosix status
This is OpenMosix node #5
Network protocol: 2 (AF_INET)
OpenMosix range     1-5     begins at fanny.wofford.int
Total configured: 5

Use this script to control openMosix as needed. You can also control openMosix with the setpe command, briefly described later in this chapter.

Congratulations, you are up and running.

5.5 Using openMosix

At its simplest, openMosix is transparent to the user. You can sit back and reap the benefits. But at times, you'll want more control. At the very least, you may want to verify that it is really running properly. (You could just time applications with computers turned on and off, but you'll probably want to be a little more sophisticated than that.) Fortunately, openMosix provides some tools that allow you to monitor and control various jobs. If you don't like the tools that come with openMosix, you can always install other tools such as openMosixView.

5.5.1 User Tools

You should install the openMosix user tools before you start running openMosix. This package includes several useful management tools (migrate, mosctl, mosmon, mosrun, and setpe), openMosix-aware versions of ps and top called, suitably, mps and mtop, and a startup script, /etc/init.d/openmosix. (This is actually a link to the file /etc/rc.d/init.d/openmosix.)

5.5.1.1 mps and mtop

Both mps and mtop will look a lot like their counterparts, ps and top. The major difference is that each has an additional column that gives the node number on which a process is running. Here is part of the output from mps:

[root@fanny sloanjd]# mps
PID TTY NODE STAT TIME COMMAND
...
19766  ?     0 R    2:32 ./loop
19767  ?     2 S    1:45 ./loop
19768  ?     5 S    3:09 ./loop
19769  ?     4 S    2:58 ./loop
19770  ?     2 S    1:47 ./loop
19771  ?     3 S    2:59 ./loop
19772  ?     6 S    1:43 ./loop
19773  ?     0 R    1:59 ./loop
...

As you can see from the third column, process 19769 is running on node 4. It is important to note that mps must be run on the machine where the process originated. You will not see the process if you run ps, mps, top, or mtop on any of the other machines in the cluster even if the process has migrated to that machine. (Arguably, in this respect, openMosix is perhaps a little too transparent. Fortunately, a couple of the other tools help.)

5.5.1.2 migrate

The tool migrate explicitly moves a process from one node to another. Since there are circumstances under which some processes can't migrate, the system may be forced to ignore this command. You'll need the PID and the node number of the destination machine. Here is an example:

[sloanjd@fanny sloanjd]$ migrate 19769 5

This command will move process 19769 to node number 5. (You can use home in place of the node number to send a process back to the CPU where it was started.) It might be tempting to think you are reducing the load on node number 4, the node where the process was running, but in a balanced system with no other action, another process will likely migrate to node 4.

5.5.1.3 mosctl

With mosctl, you have greater control over how processes are run on individual machines. For example, you can block the arrival of guest processes to lighten the load on a machine. You can use mosctl with the setspeed option to override a node's idea of its own speed. This can be used to attract or discourage process migration to the machine. mosctl can also be used to display utilization or tune openMosix performance parameters. There are too many arguments to go into here, but they are described in the manpage.
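
As a sketch of typical usage, the commands below use option names taken from the mosctl manpage; verify them against your version of the tools.

mosctl block          # refuse the arrival of guest processes on this node
mosctl noblock        # accept guest processes again
mosctl expel          # send all guest processes back to their home nodes
mosctl setspeed 18000 # override this node's idea of its own speed
mosctl getload        # display this node's current openMosix load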

5.5.1.4 mosmon

While mps won't tell you if a process has migrated to your machine, you can get a good idea of what is going on across the cluster with the mosmon utility. mosmon is an ncurses-based utility that will display a simple bar graph showing the loads on the nodes in your cluster. This can give you a pretty good idea of what is going on. Figure 5-1 shows mosmon in action.

Figure 5-1. mosmon

figs/hplc_0501.gif

In this example, eight identical processes are running on a six-node cluster. Obviously, the second and sixth nodes have two processes each while the remaining four machines are each running a single process. Of course, other processes could be mixed into this, affecting an individual machine's load. You can change the view to display memory, speed, and utilization as well as change the layout of the graph. Press h while the program is running to display the various options. Press q to quit the program.

Incidentally, mosmon goes by several different names, including mon and, less commonly, mmon. The original name was mon, and it is often referred to by that name in openMosix documentation. The shift to mosmon was made to eliminate a naming conflict with the network-monitoring tool mon. The local name is actually set by a compile-time variable.

5.5.1.5 mosrun

The mosrun command can also be used to advise the system to run a specific program on a specified node. You'll need the program name and the destination node number (or use -h for the home node). Actually, mosrun is one of a family of commands used to control node allocation preferences. These are listed and described on the manpage for mosrun.
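
For example, assuming the option syntax given on the mosrun manpage for the 0.3.x tools (the node-number form may vary between tool releases):

mosrun -h ./loop      # advise running ./loop on its home node
mosrun -3 ./loop      # advise running ./loop on node 3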

5.5.1.6 setpe

The setpe command can be used to manually configure a node. (In practice, setpe is usually called from the script /etc/init.d/openmosix rather than used directly.) As root, you can use setpe to start or stop openMosix. For example, you could start openMosix with a specific configuration file with a command like

[root@ida sloanjd]# /sbin/setpe -w -f /etc/openmosix.map

setpe takes several options including -r to read the configuration file, -c to check the map's consistency, and -off to shut down openMosix. Consult the manpage for more information.

5.5.2 openMosixView

openMosixView extends the basic functionality of the user tools while providing a spiffy X-based GUI. However, the basic user tools must be installed for openMosixView to work. openMosixView is actually seven applications that can be invoked from the main administration application.

If you want to install openMosixView, which is strongly recommended, download the package from http://www.openmosixview.com. Look over the documentation for any dependencies that might apply. Depending on what you have already installed on your system, you may need to install additional packages. For example, GLUT is one of more than two dozen dependencies. Fortunately (or annoyingly), rpm will point out what needs to be added.

Then, as root, install the appropriate packages.

[root@fanny root]# rpm -vih glut-3.7-12.i386.rpm
warning: glut-3.7-12.i386.rpm: V3 DSA signature: NOKEY, key ID db42a60e
Preparing...                ########################################### [100%]
1:glut                   ########################################### [100%]
[root@fanny root]# rpm -vih openmosixview-1.5-redhat90.i386.rpm
Preparing...                ########################################### [100%]
1:openmosixview          ########################################### [100%]

As with the kernel, you'll want to repeat this on every node. The installation places documentation in /usr/local.

Once installed, you are basically ready to run. However, by default, openMosixView uses RSH. It is strongly recommended that you change this to SSH. Make sure you have SSH set up on your system. (See Chapter 4 for more information on SSH.) Then, from the main application, select the Config menu.

The main application window is shown in Figure 5-2. You get this by running the command openmosixview in an X window environment.

Figure 5-2. openMosixView

figs/hplc_0502.gif

This view displays information for each of the five nodes in this cluster. The first column displays the node's status by node number. The background color is green if the node is available or red if it is unavailable. The second column, buttons with IP numbers, allows you to configure individual systems. If you click on one of these buttons, a pop-up window will appear for that node, as shown in Figure 5-3. You'll notice that the configuration options are very similar to those provided by the mosctl command.

Figure 5-3. openMosix configuration window

figs/hplc_0503.gif

As you can see from the figure, you can control process migration, etc., with this window. The third column in Figure 5-2, the sliders, controls the node efficiencies used by openMosix when load balancing. By changing these, you alter openMosix's idea of the relative efficiencies of the nodes in the cluster. This in turn influences how jobs migrate. Note that the slider settings do not change the efficiency of the node, just openMosix's perception of the node's capabilities. The remaining columns provide general information about the nodes. These should be self-explanatory.

The buttons along the top provide access to additional applications. For example, the third button, which looks like a gear, launches the process viewer openMosixprocs. This is shown in Figure 5-4.

Figure 5-4. openMosixprocs

figs/hplc_0504.gif

openMosixprocs allows you to view and manage individual processes started on the node from which openMosixprocs is run. (Since it won't show you processes migrated from other systems, you'll need openMosixprocs on each node.) You can select a user in the first entry field at the top of the window and click on refresh to focus in on a single user's processes. By double-clicking on an individual process, you can call up the openMosixprocs-Migrator, which will provide additional statistics and allow some control of a process.

openMosixView provides a number of additional tools that aren't described here. These include a 3D process viewer (3dmosmon), a data collection daemon (openMosixcollector), an analyzer (openMosixanalyzer), an application for viewing process history (openMosixHistory), and a migration monitor and controller (openMosixmigmon) that supports drag-and-drop control on process migration.

5.5.3 Testing openMosix

It is unlikely that you will have any serious problems setting up openMosix. But you may want to confirm that it is working. You could just start a few processes and time them with openMosix turned on and off. Here is a simple C program that can be used to generate some activity.

#include <stdio.h>

int foo(int, int);

/* Busy work: roughly 10^10 trivial function calls. */
int main( void )
{
    int i, j;

    for (i = 1; i < 100000; i++)
        for (j = 1; j < 100000; j++)
            foo(i, j);
    return 0;
}

int foo(int x, int y)
{
    return (x + y);
}

This program does nothing useful, but it will take several minutes to complete on most machines. (You can adjust the loop count if it doesn't run long enough to suit you.) By compiling this (without optimizations) and then starting several copies running in the background, you'll have a number of processes you can watch.
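
Assuming the program above is saved as loop.c (a hypothetical name), you might compile it and launch eight copies like this:

gcc -o loop loop.c        # default settings; no optimization flags
for i in 1 2 3 4 5 6 7 8
do
    ./loop &              # start eight copies in the background
done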

While timing will confirm that you are actually getting a speedup, you'll get a better idea of what is going on if you run mosmon. With mosmon, you can watch process migration and load balancing as it happens.

If you are running a firewall on your machines, the most likely problem you will have is getting connection privileges correct. You may want to start by disconnecting your cluster from the Internet and disabling the firewall. This will allow you to confirm that openMosix is correctly installed and that the firewall is the problem. You can use the command netstat -a to identify which connections you are using. This should give you some guidance in reconfiguring your firewall.

Finally, an openMosix stress test is available for the truly adventurous. It can be downloaded from http://www.openmosixview.com/omtest/. This web page also describes the test (actually a test suite) and has a link to a sample report. You can download sources or an RPM. You'll need to install expect before installing the stress test. To run the test, you should first change to the /usr/local/omtest directory and then run the script ./openmosix_stress_test.sh. A report is saved in the /tmp directory.

The test takes a while to run and produces a very long report. For example, it took over an hour and a half on an otherwise idle five-node cluster of Pentium II's and produced an 18,224-line report. While most users will find this a bit of overkill for their needs, it is nice to know it is available. Interpretation of the results is beyond the scope of this book.

5.6 Recompiling the Kernel

First, ask yourself why you would want to recompile the kernel. There are several valid reasons. If you normally have to recompile your kernel, perhaps because you use less-common hardware or need some special compile option, then you'll definitely need to recompile for openMosix. Or maybe you just like tinkering with things. If you have a reason, go for it. Even if you have never done it before, it is not that difficult, but the precompiled kernels do work well. For most readers, recompiling the kernel is optional, not mandatory. (If you are not interested in recompiling the kernel, you can skip the rest of this section.)

Before you start, do you have a recovery disk? Are you sure you can boot from it? If not, go make one right now before you begin.

Let's begin by going over the basic steps of a fairly generic recompilation, and then we'll go through an example. First, you'll need to decide which version of the kernel you want to use. Check to see what is available. (You can use the uname -r command to see what you are currently using, but you don't have to feel bound by that.)

You are going to need both a set of patches and a clean set of kernel source files. Accepted wisdom says that you shouldn't use the source files that come with any specific Linux releases because, as a result of customizations, the patches will not apply properly. As noted earlier in this chapter, you can download the kernel sources and patches from http://openmosix.sourceforge.net or you can just download the patches. If you have downloaded just the patches, you can go to http://www.kernel.org to get the sources. You'll end up with the same source files either way.

If you download the source file from the openMosix web site, you'll have an RPM package to install. When you install this, it will place compressed copies of the patches and the source tree (in gzip or bzip2 format) as well as several sample kernel configuration files in the directory /usr/src/redhat/SOURCES. The next step is to unpack the sources and apply the patches.

Using gunzip or bunzip2 and then tar, unpack the files in the appropriate directory. Where you put things is largely up to you, but it is a good idea to be consistent with the default layout of your system. Move the patch files into the root directory of your source tree. Once you have all the files in place, you can use the patch command to patch the kernel sources.

The next step is to create the appropriate configuration file. In theory, there are four ways you can do this. You could directly edit the default configuration file, typically /usr/src/linux/.config, or you can run one of the commands make config, make menuconfig, or make xconfig. In practice, you should limit yourself to the last two choices. Direct editing of the configuration file for anything other than minor changes is for fools, experts, or foolish experts. And while config is the most universal approach, it is also the most unforgiving and should be used only as a last resort. It streams the configuration decisions past you and there is no going back once you have made a decision. The remaining choices are menuconfig, which requires the ncurses library, and xconfig, which requires X windows and TCL/TK libraries. Both work nicely. Figure 5-5 shows the basic layout with menuconfig.

Figure 5-5. Main menuconfig menu

figs/hplc_0505.gif

Configuration parameters are arranged in groups by functionality. The first group is for openMosix. You can easily move through this menu and select the appropriate actions. You will be given a submenu for each group. Figure 5-6 shows the openMosix submenu.

Figure 5-6. openMosix system submenu

figs/hplc_0506.gif

xconfig is very similar but has a fancy GUI.

Because there are so many decisions, this is the part of the process where you are most apt to make a mistake. This isn't meant to discourage you, but don't be surprised if you have to go through this process several times. For the most part, the defaults are reasonable. Be sure you select the right processor type and all appropriate file systems. (Look at /etc/fstab, run the mount command, or examine /proc/filesystems to get an idea of what file systems you are currently using.) If you downloaded the sources from the openMosix web page, you have several sample configuration files. You can copy one of these over and use it as your starting point. This will give you some reasonable defaults. You can also get a description of various options (including openMosix options!) by looking in the Documentation/Configure.help file in your source tree. As a general rule of thumb, if you don't need something, don't include it.

Once you have the configuration file, you are ready to build the image. You'll use the commands make dep, make clean, make bzImage, make modules, and make modules_install. (You'll need modules enabled, since openMosix uses them.) If all goes well, you'll be left with a file bzImage in the directory arch/i386/boot/ under your source tree.

The next to last step is to install the kernel, i.e., arrange for the system to boot from this new kernel. You'll probably want to move it to the /boot directory and rename it. Since you are likely to make several kernels once you get started, be sure to use a meaningful name. You may need to create a ram-disk. You also need to configure your boot loader to find the file as described earlier in this chapter. When copying over the new kernel, don't delete the original kernel!

Now you are ready to reboot and test your new kernel. Pay close attention to the system messages when you reboot. This will be your first indication of any configuration errors you may have made. You'll need to go back to the configuration step to address these.

Of course, this is just the kernel you've installed. You'll still need to go back and install the user tools and configure openMosix for your system. But even if you are compiling the kernel, there is no reason you can't use the package to install the user tools.

Here is an example using Red Hat 9. Although Red Hat 9 comes with the 2.4.20 version of the kernel, this example uses a later version of the kernel, openmosix-kernel-2.4.24-openmosix1.src.rpm. The first step is installing this package.

[root@fanny root]# rpm -vih openmosix-kernel-2.4.24-openmosix1.src.rpm
1:openmosix-kernel       ########################################### [100%]
[root@fanny root]# cd /usr/src/redhat/SOURCES
[root@fanny SOURCES]# ls
kernel-2.4.20-athlon.config      kernel-2.4.24-athlon-smp.config
kernel-2.4.20-athlon-smp.config  kernel-2.4.24-i386.config
kernel-2.4.20-i386.config        kernel-2.4.24-i686.config
kernel-2.4.20-i686.config        kernel-2.4.24-i686-smp.config
kernel-2.4.20-i686-smp.config    linux-2.4.24.tar.bz2
kernel-2.4.24-athlon.config      openMosix-2.4.24-1.bz2

As you can see, the package includes the source files, patches, and sample configuration files.

Next, unpack the files. (With some versions, you may need to use gunzip instead of bunzip2.)

[root@fanny SOURCES]# bunzip2 linux-2.4.24.tar.bz2
[root@fanny SOURCES]# bunzip2 openMosix-2.4.24-1.bz2
[root@fanny SOURCES]# mv linux-2.4.24.tar /usr/src
[root@fanny SOURCES]# cd /usr/src
[root@fanny src]# tar -xvf linux-2.4.24.tar
...

The last command creates the directory linux-2.4.24 under /usr/src. If you are working with different versions of the kernel, you probably want to give this directory a more meaningful name.

The next step is to copy over the patch file and, if you desire, one of the sample configuration files. Then, you can apply the patches.

[root@fanny src]# cd /usr/src/redhat/SOURCES
[root@fanny SOURCES]# cp openMosix-2.4.24-1 /usr/src/linux-2.4.24/
[root@fanny SOURCES]# cp kernel-2.4.24-i686.config \
> /usr/src/linux-2.4.24/.config
[root@fanny SOURCES]# cd /usr/src/linux-2.4.24
[root@fanny linux-2.4.24]# cat openMosix-2.4.24-1 | patch -Np1
...

You should see a list of the patched files stream by as the last command runs.

Next, you'll need to create or edit a configuration file. This example uses the supplied configuration file that was copied over as a starting point.

[root@fanny linux-2.4.24]# make menuconfig

Make whatever changes you need and then save your new configuration.

Once configured, it is time to make the kernel.

[root@fanny linux-2.4.24]# make dep
...
[root@fanny linux-2.4.24]# make clean
...
[root@fanny linux-2.4.24]# make bzImage
...
[root@fanny linux-2.4.24]# make modules
...
[root@fanny linux-2.4.24]# make modules_install
...

These commands can take a while and produce a lot of output, which has been omitted here.

The worst is over now. You need to copy your kernel to /boot, create a ram-disk, and configure your boot loader.

[root@fanny linux-2.4.24]# cd /usr/src/linux-2.4.24/arch/i386/boot/
[root@fanny boot]# cp bzImage /boot/vmlinuz-8jul04

If you haven't changed kernels, you may be able to use the existing ram-disk. Otherwise, use the mkinitrd script to create a new one.

[root@fanny boot]# cd /boot
[root@fanny boot]# mkinitrd /boot/initrd-2.4.24.img 2.4.24-om

The first argument is the name for the ram-disk and the second argument is the appropriate module directory under /lib/modules. See the manpage for details.

The last step is to change the boot loader. This system uses grub, so the file /etc/grub.conf needs to be edited. You might add something like the following:

title My New openMosix Kernel
root (hd0,0)
kernel /vmlinuz-8jul04 ro root=LABEL=/
initrd /initrd-2.4.24.img

When the system reboots, the boot menu now has My New openMosix Kernel as an entry. Select that entry to boot to the new kernel.

While these steps should be adequate for most readers, it is important to note that, depending on your hardware, additional steps may be required. Fortunately, a lot has been written on the general process of recompiling Linux kernels. See Appendix A for pointers to more information.

5.7 Is openMosix Right for You?

openMosix has a lot to recommend it. Not having to change your application code is probably the biggest advantage. As a control mechanism, it provides both transparency to the casual user and a high degree of control for the more experienced user. With precompiled kernels, setup is very straightforward and goes quickly.

There is a fair amount of communication overhead with openMosix, so it works best on high-performance networks, but that is true of any cluster. It is also more operating system-specific than most approaches to distributed computing. For a high degree of control for highly parallel code, MPI is probably a better choice. This is particularly true if latency becomes an issue. But you should not overlook the advantages of using both MPI and openMosix. At the very least, openMosix may improve performance by migrating processes to less-loaded nodes.

There are a couple of other limitations to openMosix that are almost unfair to mention, since they are really outside the scope of the openMosix project. The first is the inherent granularity of process migration. If your calculation doesn't fork off processes, much of the advantage of openMosix is lost. The second limitation is a lack of scheduling control. Basically, openMosix deals with processes as it encounters them. It is up to the user to manage scheduling or just take what comes. Keep in mind that if you are using a scheduling program to get very tight control over your resources, openMosix may compete with your scheduler in unexpected ways.

In looking at openMosix, remember that it is a product of an ongoing and very active research project. Any description of openMosix is likely to become dated very quickly. By the time you have read this, it is likely that openMosix will have evolved beyond what has been described here. This is bad news for writers like me, but great news for users. Be sure to consult the openMosix documentation.

If you need to run a number of similar applications simultaneously and need to balance the load among a group of computers, you should consider openMosix.

Chapter 6. OSCAR

Setting up a cluster can involve the installation and configuration of a lot of software as well as reconfiguration of the system and previously installed software. OSCAR (Open Source Cluster Application Resources) is a software package that is designed to simplify cluster installation. A collection of open source clus
