ShaoLin Microsystems  

10.1. Making changes to cluster configurations

After you change the cluster configuration, you must regenerate the cluster boot image so that the standby cluster can read the new configuration. If the standby cluster is online, you must reboot the standby server for the changes to take effect. See Section 10.1.2 for booting the standby cluster in test mode. Before you power up the standby cluster with the new changes, run slhactl -r to reload the changes on the active cluster. You may also need to edit your boot loader settings if the newly generated cluster boot image has a different name from the one in your current boot loader configuration. The default cluster boot image is named /boot/slha-initrd-<kernel-version>.img (e.g. /boot/slha-initrd-2.4.19-4GB.img). If you use the Webmin cluster administration interface (Cluster Configurator), the boot loader is configured automatically. To do this manually, run the command slhamkinitrd. See Section 10.1.1 for more information.
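
The default image name described above can be computed and checked with a small shell helper. This is only a sketch under assumptions: slhamkinitrd is the command named in this section and is invoked here without arguments, since its options are not documented on this page; the function names slha_image_path and slha_regenerate_image are hypothetical; and the image name is assumed to follow the /boot/slha-initrd-<kernel-version>.img pattern.

```shell
#!/bin/sh
# Print the default cluster boot image path for a kernel version.
# Usage: slha_image_path [kernel-version] [boot-dir]
slha_image_path() {
    kver="${1:-$(uname -r)}"
    bootdir="${2:-/boot}"
    printf '%s/slha-initrd-%s.img\n' "$bootdir" "$kver"
}

# Hypothetical helper: regenerate the image, then report whether the
# default-named file exists. A missing file is only a warning, because the
# generated image may carry a different name.
slha_regenerate_image() {
    slhamkinitrd || return 1
    image="$(slha_image_path "$@")"
    if [ -f "$image" ]; then
        echo "cluster boot image: $image"
    else
        echo "warning: $image not found; check the image name in your boot loader" >&2
    fi
}
```

One possible sequence is to call slha_regenerate_image, then run slhactl -r on the active cluster, and only then reboot or power up the standby cluster.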

10.1.1. Boot loader configuration

The most common boot loaders used on Linux are GRUB and LILO. This section describes how to update the configuration of each. If you use a different boot loader, configure it according to its own documentation. ShaoLin HA Cluster uses the standard Linux boot procedure, so it should work with any Linux boot loader that supports the initrd mechanism. To determine which boot loader you are using, check whether the LILO configuration file exists at /etc/lilo.conf, or whether a GRUB configuration file exists at /boot/grub/menu.lst or /boot/grub/grub.conf.
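
The file checks described above can be scripted. The following is a minimal sketch; the function name detect_bootloader and its optional root-directory argument are hypothetical and exist only to make the check easy to try against a scratch directory.

```shell
#!/bin/sh
# Report which boot loader configuration files exist under a root directory.
# Usage: detect_bootloader [root-dir]   (an empty root-dir means /)
detect_bootloader() {
    root="${1:-}"
    found=""
    if [ -f "$root/etc/lilo.conf" ]; then
        found="lilo"
    fi
    if [ -f "$root/boot/grub/menu.lst" ] || [ -f "$root/boot/grub/grub.conf" ]; then
        # Append with a separating space only when LILO was also found.
        found="${found:+$found }grub"
    fi
    echo "${found:-unknown}"
}
```

If both configuration files are present, the function prints both names, and you will need to confirm which boot loader is actually installed in the boot record.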

10.1.1.1. Configuring GRUB

Edit the boot loader configuration file /boot/grub/menu.lst and add new entries for the cluster boot image, using the generated initrd image file. Copy the existing lines of your standard boot entry and change the initrd line to use the newly generated cluster-enabled initrd image file name. You may also want to add a second entry for a test mode, which passes the kernel parameter slha=test to force the cluster to enter standby mode without the ability to fail over. You can use this test mode to verify the configuration; see Section 10.1.2 for more information on testing the cluster. An example GRUB configuration file follows. For more information on the GRUB configuration file, see the GRUB manual pages (man grub) or visit the official GRUB website.


default 2
timeout=10
splashimage=(hd0,0)/grub/splash.xpm.gz
title Red Hat Linux (2.4.7-10smp)
root (hd0,0)
kernel /vmlinuz-2.4.7-10smp ro root=/dev/sda2
initrd /initrd-2.4.7-10smp.img
title Red Hat Linux-up (2.4.7-10)
root (hd0,0)
kernel /vmlinuz-2.4.7-10 ro root=/dev/sda2
initrd /initrd-2.4.7-10.img

# Start added by ShaoLin HA Cluster
# Please don't delete the line above
title Shaolin HA Cluster
root (hd0,0)
kernel /vmlinuz-2.4.7-10smp ro root=/dev/sda2
initrd /slha-initrd-2.4.7-10smp.img
title Shaolin HA Cluster (Test Mode)
root (hd0,0)
kernel /vmlinuz-2.4.7-10smp ro root=/dev/sda2 slha=test
initrd /slha-initrd-2.4.7-10smp.img
# Please don't delete the line below
# End added by ShaoLin HA Cluster
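
To review which entries of a GRUB configuration like the one above boot the cluster image, and which pass an slha= parameter, the file can be summarized with awk. This is a sketch only: the function name grub_list_entries and the title|initrd|mode output format are hypothetical choices, not part of ShaoLin HA Cluster.

```shell
#!/bin/sh
# List each GRUB menu entry as "title|initrd|slha-mode".
# A "-" means the entry has no initrd line or no slha= kernel parameter.
# Usage: grub_list_entries [menu.lst]
grub_list_entries() {
    awk '
        function emit() {
            if (title != "") printf "%s|%s|%s\n", title, initrd, mode
        }
        $1 == "title" {
            emit()
            title = $0
            sub(/^[ \t]*title[ \t]+/, "", title)
            initrd = "-"
            mode = "-"
        }
        $1 == "initrd" { initrd = $2 }
        $1 == "kernel" {
            for (i = 2; i <= NF; i++)
                if ($i ~ /^slha=/) mode = substr($i, 6)
        }
        END { emit() }
    ' "${1:-/boot/grub/menu.lst}"
}
```

Run against the example above, the last entry would be reported with the slha-initrd image and the mode test, while the plain Red Hat entries would show "-" in the mode column.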

10.1.1.2. Configuring LILO

Edit the boot loader configuration file /etc/lilo.conf and add new entries for the cluster initrd image, including one for test mode.


prompt
timeout=50
default=linux
boot=/dev/sda
map=/boot/map
install=/boot/boot.b
message=/boot/message
linear

image=/boot/vmlinuz-2.4.7-10smp
label=linux
initrd=/boot/initrd-2.4.7-10smp.img
read-only
root=/dev/sda2

# Start added by ShaoLin HA Cluster
# Please don't delete the line above
# LILO labels must be unique; slha and slha-test are illustrative names
image=/boot/vmlinuz-2.4.7-10smp
label=slha
initrd=/boot/slha-initrd-2.4.7-10smp.img
read-only
root=/dev/sda2
image=/boot/vmlinuz-2.4.7-10smp
label=slha-test
initrd=/boot/slha-initrd-2.4.7-10smp.img
read-only
root=/dev/sda2
append="slha=test"
# Please don't delete the line below
# End added by ShaoLin HA Cluster

After you have modified the LILO configuration file, run the command lilo to update your boot record. Make sure the command completes without errors; otherwise your machine may be unable to boot. For more information, see the manual pages for lilo.conf (man lilo.conf) and for LILO (man lilo). You may also want to read the LILO mini-HOWTO.
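
Because a failed lilo run can leave the machine unbootable, it may help to check the configuration before writing the boot record. The sketch below makes two assumptions: LILO expects every image label to be unique, and the lilo options -C (alternate configuration file) and -t (test only, do not write the boot sector or map file) behave as described in man lilo. The function names lilo_duplicate_labels and lilo_safe_install are hypothetical.

```shell
#!/bin/sh
# Print any label that appears more than once in a LILO configuration file.
# Usage: lilo_duplicate_labels [lilo.conf]
lilo_duplicate_labels() {
    sed -n '/^[[:space:]]*label[[:space:]]*=/{
        s/^[^=]*=[[:space:]]*//
        s/[[:space:]]*$//
        p
    }' "${1:-/etc/lilo.conf}" | sort | uniq -d
}

# Hypothetical wrapper: stop on duplicate labels, do a dry run with
# "lilo -t", and write the boot record only if the dry run succeeds.
# Usage: lilo_safe_install [lilo.conf]
lilo_safe_install() {
    conf="${1:-/etc/lilo.conf}"
    dups="$(lilo_duplicate_labels "$conf")"
    if [ -n "$dups" ]; then
        echo "duplicate LILO labels in $conf: $dups" >&2
        return 1
    fi
    lilo -C "$conf" -t && lilo -C "$conf"
}
```

The dry run does not replace reading the output of the real lilo run; it only catches configuration errors before the boot record is touched.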

10.1.2. Testing Cluster Configurations

Every time you change the cluster, you should verify the cluster configuration before the changes take effect on the production system. ShaoLin HA Cluster allows you to make changes to the production cluster system and suspend the failover procedures for testing purposes. To test, boot your standby cluster into a mode known as suspend or test mode. The choice between test mode and production mode at boot time is made through the boot loader's boot labels. See Section 10.1.1 for more information about boot loaders.

After you have made changes to the cluster configuration, you should verify that the setup is correct. Once you have generated the new cluster boot images and reloaded the new configuration, restart or power up your standby cluster. At startup, in your boot loader menu, select the entry that starts the standby cluster in test mode, or simply add slha=maintenance to the kernel command line in your boot loader to force the node to boot as a standby cluster with failover disabled. It is dangerous not to test your standby cluster after making changes: with an incorrect setup, the system might malfunction, or it might fail over without stopping the active cluster and thereby corrupt data on the shared storage.
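
After the standby cluster has booted, you can confirm which slha= parameter the kernel actually received by inspecting the kernel command line, which Linux exposes in /proc/cmdline. The sketch below only extracts the parameter value; the function name slha_boot_mode is hypothetical, and how the cluster software interprets each value is not covered here.

```shell
#!/bin/sh
# Print the value of the slha= kernel parameter from a command line string,
# or "none" when the parameter is absent.
# Usage: slha_boot_mode [cmdline]   (defaults to the contents of /proc/cmdline)
slha_boot_mode() {
    cmdline="${1-$(cat /proc/cmdline)}"
    mode="none"
    # Unquoted on purpose: split the command line into whitespace-separated words.
    for word in $cmdline; do
        case "$word" in
            slha=*) mode="${word#slha=}" ;;
        esac
    done
    echo "$mode"
}
```

A result of none on the standby cluster suggests that the production entry was booted rather than the test entry, in which case failover is not suspended.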

In test mode, the cluster manager is invoked and you should be able to see the heartbeat and service information of the cluster system. If you see any loss of heartbeat (i.e. a non-zero value in the lost field), go back and check your connections or the service that is reporting the error.