Creating a Red Hat Cluster: Part 5

Welcome back to LINternUX for the last article of this series on how to build a working Red Hat cluster. So far we have a working cluster, but it only moves the IP from server to server. In this article, we will put everything in place so that we have an FTP and a web service that are fully redundant within our cluster. In our previous article, we created a GFS filesystem under the mount point “/cadmin”; this is where we will put the scripts, configuration files and logs used by our cluster. The content of the “/cadmin” filesystem can be downloaded here; it includes the whole directory structure and the scripts used in our cluster articles. After this article, you will have a fully configured cluster running an FTP and a web service. We have a lot to do, so let’s begin.

 

FTP prerequisite

We need to make sure that the FTP server “vsftpd” is installed on every server in our cluster. You can check whether it is installed by typing the following command:

root@gandalf:~# rpm -q vsftpd
vsftpd-2.0.5-16.el5_5.1
root@gandalf:~#

If it is not installed, run the following command to install it on the servers where it is missing:

root@bilbo:~# yum install vsftpd

We must make sure that vsftpd is not started and doesn’t start upon reboot. To do so, use the following commands on all servers (the [FAILED] below only means that vsftpd was not running when we tried to stop it):

root@bilbo:~# service vsftpd stop
Shutting down vsftpd:                                      [FAILED]
root@bilbo:~# chkconfig vsftpd off
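To confirm that a service really is disabled at boot on a node, the output of “chkconfig --list” can be checked. The helper below is only a minimal sketch: the function name “runlevels_on” is hypothetical, and it assumes the usual output format of “chkconfig --list <service>” (the service name followed by fields such as “3:on” or “3:off”).

```shell
#!/bin/bash
# Hypothetical helper: reads "chkconfig --list <service>" output on stdin
# and prints the runlevels where the service is still "on".
# Prints nothing when the service is off in every runlevel.
runlevels_on() {
    awk '{
        out = ""
        for (i = 2; i <= NF; i++) {
            split($i, f, ":")          # f[1] = runlevel, f[2] = on/off
            if (f[2] == "on") out = (out == "" ? f[1] : out " " f[1])
        }
        if (out != "") print out
    }'
}
```

For example, “chkconfig --list vsftpd | runlevels_on” should print nothing once “chkconfig vsftpd off” has been run on that server.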

Script to stop/start/status our FTP service

Now we need to create a script for each of our services (FTP and web) that the cluster software will use to stop and start the appropriate service, and add it to our cluster configuration. We’ll put these scripts in our /cadmin GFS filesystem, so they are accessible by our 3 servers. We will start by creating the script for the FTP service. A script used by the Red Hat Cluster Suite receives one parameter when called by the cluster software. This parameter can be “stop”, “start” or “status”. You can download a copy of the script and the vsftpd configuration file if you want, but remember that if you want to use them as is, you must put them in the /cadmin filesystem. The “srv_ftp.sh” script goes in the subdirectory “/cadmin/srv” and the configuration file “srv_ftp.conf” must go in the “/cadmin/cfg” directory. But nothing beats an example, so let’s build the one for our FTP service.

#! /bin/bash
# ---------------------------------------------------------------------------------
# Script to stop/start and give a status of ftp service in the cluster.
# This script is built to receive one of 3 parameters.
#    - start :  Executed by cluster to start the application(s) or service(s)
#    - stop  :  Executed by cluster to stop  the application(s) or service(s)
#    - status:  Executed by cluster every 30 seconds to check service status.
# ---------------------------------------------------------------------------------
# Author    : Jacques Duplessis - April 2011
# ---------------------------------------------------------------------------------
#set -x
CDIR="/cadmin"              ; export CDIR       # Root directory for Services
CSVC="$CDIR/srv"            ; export CSVC       # Service Scripts Directory
CCFG="$CDIR/cfg"            ; export CCFG       # Service Config. Directory
INST="srv_ftp"              ; export INST       # Service Instance Name
LOG="$CDIR/log/${INST}.log" ; export LOG        # Service Log file name
HOSTNAME=`hostname -a`      ; export HOSTNAME   # HostName
VSFTPD="/usr/sbin/vsftpd"   ; export VSFTPD     # Service Program name
FCFG="${CCFG}/${INST}.conf" ; export FCFG       # Service Config. file name
RC=0                        ; export RC         # Service Return Code
DASH="---------------------"; export DASH       # Dash Line

# Where the Action Start
# ---------------------------------------------------------------------------------
case "$1" in
  start)  echo -e "\n${DASH}" >> $LOG 2>&1
          echo -e "Starting service $INST on $HOSTNAME at `date`" >> $LOG 2>&1
          echo -e "${VSFTPD} ${FCFG}" >> $LOG 2>&1
          ${VSFTPD} ${FCFG} >> $LOG 2>&1
          RC=$?
          FPID=`ps -ef |grep -v grep |grep ${FCFG} |awk '{ print $2 }'|head -1`
          echo "Service $INST started on $HOSTNAME - PID=${FPID} RC=$RC">> $LOG
          echo "${DASH}" >> $LOG 2>&1
          ;;
  stop )  echo -e "\n${DASH}" >> $LOG 2>&1
          echo -e "Stopping Service $INST on $HOSTNAME at `date` " >> $LOG
          ps -ef | grep ${FCFG}| grep -v grep >> $LOG 2>&1
          FPID=`ps -ef |grep -v grep |grep ${FCFG} |awk '{ print $2 }'|head -1`
          echo -e "Killing PID ${FPID}" >> $LOG 2>&1
          kill $FPID  >> $LOG 2>&1
          echo -e "Service $INST is stopped ..." >> $LOG 2>&1
          RC=0
          echo "${DASH}" >> $LOG 2>&1
          ;;
  status) COUNT=`ps -ef | grep ${FCFG}| grep -v grep | wc -l`
          FPID=`ps -ef |grep -v grep |grep ${FCFG} |awk '{ print $2 }'|head -1`
          echo -n "`date` Service $INST ($COUNT) on $HOSTNAME">> $LOG 2>&1
          if [ $COUNT -gt 0 ]
             then echo " - PID=${FPID} - OK"  >> $LOG 2>&1
                  RC=0
             else echo " - NOT RUNNING" >> $LOG 2>&1
                  ps -ef | grep -i ${FCFG} | grep -v grep  >> $LOG 2>&1
                  RC=1
          fi
          ;;
esac
exit $RC

This script is placed in the directory “/cadmin/srv” and named “srv_ftp.sh”.
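Before handing the script to the cluster, it can be exercised by hand the same way the cluster software calls it: with a single argument, then looking at the return code. The helper below is a minimal sketch; the function name “try_action” is hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: call a service script with one action
# (start, stop or status) and report the return code the caller would see.
try_action() {
    local script="$1" action="$2" rc
    "$script" "$action"
    rc=$?
    echo "$action rc=$rc"
    return $rc
}
```

With the script above, “try_action /cadmin/srv/srv_ftp.sh start” followed by “try_action /cadmin/srv/srv_ftp.sh status” should report rc=0 while vsftpd is running, and the status call should report rc=1 after a stop.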

 

Add ftp script to our ftp cluster service

Now, let’s add this script to our ftp service. Run the “system-config-cluster” command to start the cluster configuration GUI.

root@gandalf:~# system-config-cluster &

 

Click on “Resources” on the left side and then on the “Create a Resource” button at the bottom right of the screen. This will allow us to insert our ftp service script into the cluster configuration.

Select “Script” from the “Resource Type” list, enter the name of our resource, “srv_ftp”, and then specify the name of the script the service will use, with its full path. As mentioned earlier, we placed it in our “/cadmin” GFS filesystem so that it is seen by every node in the cluster.

 

 

Now we need to edit our “srv_ftp” service to add the resource we just created.

 

Select the “srv_ftp” service at the bottom left of the screen and then press the “Edit Service Properties” button.

 

 

 

 

Click on the “Add a Shared Resource to this service” button. This will bring up a screen where we select the “srv_ftp” script that we want to add to our service.

After adding our script to the resource, press the “Close” button.

We are now ready to push our new configuration to the members of our cluster; press the “Send to Cluster” button to do so.

Web site prerequisite

Make sure that the “httpd” and “php” packages are installed on every server in our cluster. You can check whether they are installed by typing the following command:

root@gandalf # rpm -q httpd php
httpd-2.2.3-45.el5
php-5.1.6-27.el5_5.3
root@gandalf #

If they are not installed, run the following command to install them on the servers where they are missing:

root@bilbo:~# yum install httpd php

We must make sure that “httpd” is not started and doesn’t start upon reboot. To do so, use the following commands on all servers:

root@bilbo:~# service httpd stop
Shutting down httpd:                                      [FAILED]
root@bilbo:~# chkconfig httpd off

Script to stop/start/status our Web service

We have simplified the configuration of our web site to the minimum. This was done intentionally: we wanted to demonstrate the cluster functionality and not the possibilities of “httpd”. But our web site will be functional and redundant. Our web server script works very much like the ftp script. You can download this script and the httpd configuration file if you want, but remember that if you want to use them as is, you must put them in the /cadmin filesystem. The “srv_www.sh” script goes in the subdirectory “/cadmin/srv” and the configuration file “srv_www.conf” must go in the “/cadmin/cfg” directory.

#! /bin/bash
# ---------------------------------------------------------------------------------
# Script to stop/start and give a status of our web service in the cluster.
# This script is built to receive one of 3 parameters.
#    - start :  Executed by cluster to start the application(s) or service(s)
#    - stop  :  Executed by cluster to stop  the application(s) or service(s)
#    - status:  Executed by cluster every 30 seconds to check service status.
# ---------------------------------------------------------------------------------
# Author    : Jacques Duplessis - April 2011
# ---------------------------------------------------------------------------------
#set -x
CDIR="/cadmin"              ; export CDIR       # Root directory for Services
CSVC="$CDIR/srv"            ; export CSVC       # Service Scripts Directory
CCFG="$CDIR/cfg"            ; export CCFG       # Service Config. Directory
INST="srv_www"              ; export INST       # Service Instance Name
LOG="$CDIR/log/${INST}.log" ; export LOG        # Service Log file name
HOSTNAME=`hostname -a`      ; export HOSTNAME   # HostName
HTTPD="/usr/sbin/httpd"     ; export HTTPD      # Service Program name
HCFG="${CCFG}/${INST}.conf" ; export HCFG       # Service Config. file name
RC=0                        ; export RC         # Service Return Code
DASH="---------------------"; export DASH       # Dash Line

# Where the Action Start
# ---------------------------------------------------------------------------------
case "$1" in
 start)  echo -e "\n${DASH}" >> $LOG 2>&1
         echo -e "Starting service $INST on $HOSTNAME at `date`" >> $LOG 2>&1
         echo -e "${HTTPD} -f ${HCFG}" >> $LOG 2>&1
         ${HTTPD} -f ${HCFG} >> $LOG 2>&1
         RC=$?
         HPID=`cat ${CCFG}/${INST}.pid`
         echo "Service $INST started on $HOSTNAME - PID=${HPID} RC=$RC">> $LOG
         echo "${DASH}" >> $LOG 2>&1       
         ;;
 stop )  echo -e "\n${DASH}" >> $LOG 2>&1
         echo -e "Stopping Service $INST on $HOSTNAME at `date` " >> $LOG
         HPID=`cat ${CCFG}/${INST}.pid`
         echo -e "Killing PID ${HPID}" >> $LOG 2>&1
         kill $HPID  > /dev/null 2>&1
         echo -e "Service $INST is stopped ..." >> $LOG 2>&1
         RC=0
         echo "${DASH}" >> $LOG 2>&1       
         ;;
 status) COUNT=`ps -ef | grep ${HCFG}| grep -v grep | wc -l`
         HPID=`cat ${CCFG}/${INST}.pid`
         echo -n "`date` Service $INST ($COUNT) on $HOSTNAME">> $LOG 2>&1
         if [ $COUNT -gt 0 ]
            then echo " - PID=${HPID} - OK"  >> $LOG 2>&1
                 RC=0
            else echo " - NOT RUNNING" >> $LOG 2>&1
                 ps -ef | grep -i ${HCFG} | grep -v grep  >> $LOG 2>&1
                 RC=1
         fi
         ;;
esac
exit $RC
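The status branch above counts matching lines in “ps” and reads the PID from the pid file only for logging. An alternative check, sketched below, tests the PID found in the pid file directly with “kill -0”, which sends no signal and only reports whether the process exists. This is not part of the original script: the function name “pid_alive” is hypothetical, and the sketch assumes it runs as root, since “kill -0” fails on a process owned by another user.

```shell
#!/bin/bash
# Hypothetical helper: return 0 if the PID stored in the given pid file
# belongs to a live process, 1 otherwise (missing file, bad content, dead PID).
pid_alive() {
    local pidfile="$1" pid
    [ -r "$pidfile" ] || return 1
    pid=$(head -1 "$pidfile" | tr -d '[:space:]')
    case "$pid" in
        ''|*[!0-9]*) return 1 ;;       # empty or not a number
    esac
    kill -0 "$pid" 2>/dev/null         # signal 0: existence check only
}
```

In the script above, this could be called as “pid_alive ${CCFG}/${INST}.pid” and its return code used as RC.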

Updating our cluster Configuration

To add our web service, follow the same sequence as we did when we inserted our ftp service into the cluster configuration. You only need to replace “srv_ftp.sh” with “srv_www.sh”; the script path is the same, since we decided to place our scripts in the directory “/cadmin/srv”. Once we have pushed the new configuration to all servers in the cluster, we should have a working cluster. The web site defined in the configuration has its “Root Directory” set to “/cadmin/www/html”. It contains only one file, which displays the name of the server it is running on. This will help us test our cluster configuration.

If you wish to use the cluster configuration, scripts and configuration files we have used in this series of articles, I would encourage you to download the “cadmin.tar” file. The file is the actual content of the “/cadmin” directory used throughout this series. To use it, download the “cadmin.tar” file, copy it to your “/cadmin” directory and enter the command “tar -xvf ./cadmin.tar”. This will extract the tar file and give you the working environment I used in this article.
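The extraction step can be wrapped so that the archive is checked before anything is written. This is only a sketch: the function name “unpack_cadmin” is hypothetical, and the “-C” option is used to extract into the target directory instead of changing into it first.

```shell
#!/bin/bash
# Hypothetical helper: check that the archive is a readable tar file,
# then extract it into the target directory.
unpack_cadmin() {
    local archive="$1" target="$2"
    tar -tf "$archive" > /dev/null || return 1   # stop here if unreadable
    tar -xvf "$archive" -C "$target"
}
```

For example, “unpack_cadmin /cadmin/cadmin.tar /cadmin” does the same job as the “tar -xvf ./cadmin.tar” command run from “/cadmin”.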

Testing our ftp service

So here we are (finally, you would say, and so would I): we now have a fully working cluster. If we issue the “clustat” command, this is what we should see.

root@gollum:/# clustat
Cluster Status for our_cluster @ Sat Apr 16 11:37:25 2011
Member Status: Quorate

 Member Name                        ID   Status
 ------ ----                       ---- ------
 hbbilbo.maison.ca                    1 Online, rgmanager
 hbgandalf.maison.ca                  2 Online, rgmanager
 hbgollum.maison.ca                   3 Online, Local, rgmanager

 Service Name                     Owner (Last)            State        
 ------- ----                     ----- ------            -----        
 service:srv_ftp                  hbgollum.maison.ca      started      
 service:srv_www                  hbbilbo.maison.ca       started      
root@gollum:/#

From the information above, we can see that all our cluster members are online and that the resource manager is running on all of them. The resource manager is important: it is responsible for moving services around when needed. Our service “srv_ftp” is started (running) on the “hbgollum” server and “srv_www” is running on “hbbilbo”, as we decided at the beginning (remember?).

root@gollum:/# ip addr show | grep 192
    inet 192.168.1.104/24 brd 192.168.1.255 scope global eth0
    inet 192.168.1.204/24 scope global secondary eth0
root@gollum:/#
root@gollum:/# ps -ef | grep vsftpd | grep -v grep
root      7858     1  0 10:05 ?        00:00:00 /usr/sbin/vsftpd /cadmin/cfg/srv_ftp.conf
root@gollum:/#

The command “ip addr show | grep 192” confirms that the virtual IP is defined on the “hbgollum” server, and the “ps” output shows that the ftp process is running there as well. So let’s try an FTP connection to our virtual IP, which we have named “ftp.maison.ca” (192.168.1.204). We try it from the “gandalf” server and see that it is working.

root@gandalf:/# ftp ftp.maison.ca
Connected to ftp.maison.ca.
220 ftp.maison.ca
530 Please login with USER and PASS.
530 Please login with USER and PASS.
KERBEROS_V4 rejected as an authentication type
Name (ftp.maison.ca:root):
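When these checks are repeated often, the owner and state of a service can be pulled out of the “clustat” output with a small filter. The sketch below assumes the service lines look like the ones shown above (“service:name”, owner, state); the function name “service_info” is hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: read "clustat" output on stdin and print the owner
# and state of one service, e.g. "hbgollum.maison.ca started".
service_info() {
    local svc="$1"
    awk -v name="service:$svc" '$1 == name { print $2, $3 }'
}
```

It would be used as “clustat | service_info srv_ftp”.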

 

Now let’s move the ftp service from “hbgollum” to “hbbilbo”, to see if the ftp service continues to work. To move the service we will use the “clusvcadm” command; we need to specify the name of the service to relocate (-r) and the server (-m, for member) we wish to move it to. You can issue the “clusvcadm” command on any of the servers within our cluster. Enter the following command to move our service to “hbbilbo”:

root@gandalf:/# clusvcadm -r srv_ftp -m hbbilbo
'hbbilbo' not in membership list
Closest match: 'hbbilbo.maison.ca'
Trying to relocate service:srv_ftp to hbbilbo.maison.ca...Success
service:srv_ftp is now running on hbbilbo.maison.ca
root@gandalf:/#

Notice that just after pressing [Enter], we got “‘hbbilbo’ not in membership list”. This is because we did not include the domain name ‘maison.ca’, but the command managed to work out that we were referring to “hbbilbo.maison.ca”. Our command succeeded, so let’s see if everything went as it should have.
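Since the message comes from the missing domain name, passing the full member name should avoid it. The sketch below completes a short node name; the function name “member_fqdn” is hypothetical and the default domain “maison.ca” is simply the one used in this series.

```shell
#!/bin/bash
# Hypothetical helper: append the cluster domain to a short node name so
# that the exact member name can be passed to clusvcadm. Names that
# already contain a dot are returned unchanged.
member_fqdn() {
    local node="$1" domain="${2:-maison.ca}"
    case "$node" in
        *.*) echo "$node" ;;
        *)   echo "$node.$domain" ;;
    esac
}
```

The relocation could then be written as: clusvcadm -r srv_ftp -m "$(member_fqdn hbbilbo)"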

First, let’s execute the clustat command to see if the “srv_ftp” service is now running on ‘hbbilbo’.

root@gandalf:/# clustat 
Cluster Status for our_cluster @ Sat Apr 16 12:10:01 2011
Member Status: Quorate

 Member Name                      ID   Status
 ------ ----                      ---- ------
 hbbilbo.maison.ca                    1 Online, rgmanager
 hbgandalf.maison.ca                  2 Online, Local, rgmanager
 hbgollum.maison.ca                   3 Online, rgmanager

 Service Name            Owner (Last)            State
 ------- ----            ----- ------            -----
 service:srv_ftp         hbbilbo.maison.ca       started      
 service:srv_www         hbbilbo.maison.ca       started
root@gandalf:/#

We can see that our ftp service is now reported on “hbbilbo”; let’s see if it really is. If we check whether 192.168.1.204 (ftp.maison.ca) is now defined on “hbbilbo”, we can see that it is. The FTP process is also running on that server now.

root@bilbo:~# ip addr show | grep 192
    inet 192.168.1.111/24 brd 192.168.1.255 scope global eth0
    inet 192.168.1.211/24 scope global secondary eth0
    inet 192.168.1.204/24 scope global secondary eth0
root@bilbo:~# ps -ef | grep vsftpd | grep -v grep
root      8616     1  0 12:05 ?        00:00:00 /usr/sbin/vsftpd /cadmin/cfg/srv_ftp.conf
root@bilbo:~#

But what happened on ‘hbgollum’? The IP 192.168.1.204 should no longer be there and the FTP process should no longer be running. That is indeed what happened: the IP is gone and the ftp process is no longer running. So far so good, the move to the ‘hbbilbo’ server has worked.

root@gollum:/etc/profile.d# ip addr show | grep 192
    inet 192.168.1.104/24 brd 192.168.1.255 scope global eth0
root@gollum:/etc/profile.d# ps -ef | grep vsftpd | grep -v grep
root@gollum:/etc/profile.d#
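The “grep 192” check is read by eye. For a scripted check, the sketch below looks for one exact address in the “ip addr show” output, so that 192.168.1.204 does not match a longer or shorter address by accident. The function name “has_ip” is hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: read "ip addr show" output on stdin and return 0 if
# the given IPv4 address is configured on the node, 1 otherwise.
has_ip() {
    local ip="$1"
    awk -v ip="$ip" '
        $1 == "inet" { split($2, a, "/"); if (a[1] == ip) found = 1 }
        END { exit !found }'
}
```

On each node it would be used as: ip addr show | has_ip 192.168.1.204 && echo "virtual IP is here"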

The last test is to try an ftp connection to ftp.maison.ca and see if it responds.

root@gandalf:/# ftp ftp.maison.ca
Connected to ftp.maison.ca.
220 ftp.maison.ca
530 Please login with USER and PASS.
530 Please login with USER and PASS.
KERBEROS_V4 rejected as an authentication type
Name (ftp.maison.ca:root):

Great, everything worked! Let’s move the ftp service back to ‘hbgollum’ before testing our web site. Open another terminal window, enter the “clustat -i 2” command and watch the status change through ‘started/stopping/starting/started’ while the move is going on. Check your /var/log/messages and familiarize yourself with the lines recorded when the move happens.

root@gandalf:/# clusvcadm -r srv_ftp -m hbgollum
'hbgollum' not in membership list
Closest match: 'hbgollum.maison.ca'
Trying to relocate service:srv_ftp to hbgollum.maison.ca...Success
service:srv_ftp is now running on hbgollum.maison.ca
root@gandalf:/#
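Instead of watching “clustat -i 2” by eye, a script can poll until the service reaches the wanted state. This is a minimal sketch under a few assumptions: the state is the last field of the service line, as in the outputs shown in this article, and the names “wait_for_state”, “CLUSTAT” and “POLL_SECONDS” are hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: poll the status command until the given service
# reports the wanted state, or give up after a number of tries.
# CLUSTAT can point to a stub for a dry run; it defaults to clustat.
wait_for_state() {
    local svc="$1" want="$2" tries="${3:-30}" state
    while [ "$tries" -gt 0 ]; do
        state=$(${CLUSTAT:-clustat} | awk -v n="service:$svc" '$1 == n { print $NF }')
        [ "$state" = "$want" ] && return 0
        tries=$((tries - 1))
        sleep "${POLL_SECONDS:-2}"
    done
    return 1
}
```

For example, “wait_for_state srv_ftp started” run right after a relocation returns 0 as soon as the service is reported as started.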

One of the tests we should make is to unplug the network cable or power off “hbgollum” and see if the service moves to the next server in the failover domain (it will). We have now completed and tested our ftp service. It has been a long road, but it was worth it, no?

 

 

Testing our web service

You now know how to move a service from one server to another. Let’s do the same test with our web service. The web site is actually just one simple page that displays the name of the server it is running on, which simplifies our testing.

If we issue the “clustat” command, we should get the following picture:

# clustat
Cluster Status for our_cluster @ Sat Apr 16 14:15:37 2011
Member Status: Quorate

Member Name                      ID   Status
------ ----                      ---- ------
hbbilbo.maison.ca                    1 Online, rgmanager
hbgandalf.maison.ca                  2 Online, Local, rgmanager
hbgollum.maison.ca                   3 Online, rgmanager

Service Name            Owner (Last)            State
------- ----            ----- ------            -----
service:srv_ftp         hbgollum.maison.ca      started
service:srv_www         hbbilbo.maison.ca       started

Let’s see if it is working. Open your browser and type the URL “http://www.maison.ca”; you should see a page displaying the name of the server the site is running on.

Now, let’s move the web site to the “gandalf” server. Type the following command:

root@gollum:/cadmin/cfg# clusvcadm -r srv_www -m hbgandalf
'hbgandalf' not in membership list
Closest match: 'hbgandalf.maison.ca'
Trying to relocate service:srv_www to hbgandalf.maison.ca...Success
service:srv_www is now running on hbgandalf.maison.ca
root@gollum:/cadmin/cfg#
root@gollum:/cadmin/cfg# clustat
Cluster Status for our_cluster @ Sat Apr 16 14:27:14 2011
Member Status: Quorate

 Member Name                         ID   Status
 ------ ----                         ---- ------
 hbbilbo.maison.ca                       1 Online, Local, rgmanager
 hbgandalf.maison.ca                     2 Online, rgmanager
 hbgollum.maison.ca                      3 Online, rgmanager

 Service Name               Owner (Last)               State
 ------- ----               ----- ------               -----
 service:srv_ftp            hbgollum.maison.ca         started
 service:srv_www            hbgandalf.maison.ca        started 

 

We can see that the web site is now running on the “gandalf” server.

 

 

Disabling and Enabling Services

There may come a time when you need to stop a service completely. We will demonstrate how to achieve that. First, let’s display the status of our cluster:

root@bilbo:~# clustat
Cluster Status for our_cluster @ Sat Apr 16 14:39:48 2011
Member Status: Quorate

 Member Name                                  ID   Status
 ------ ----                                  ---- ------
 hbbilbo.maison.ca                                1 Online, Local, rgmanager
 hbgandalf.maison.ca                              2 Online, rgmanager
 hbgollum.maison.ca                               3 Online, rgmanager

 Service Name                        Owner (Last)                        State
 ------- ----                        ----- ------                        -----
 service:srv_ftp                     hbgollum.maison.ca                  started
 service:srv_www                     hbgandalf.maison.ca                 started  

We are going to disable the “srv_www” service. To do so, enter the following command:

root@bilbo:~# clusvcadm -d srv_www 
Local machine disabling service:srv_www...Success

The clustat command shows us that the service is now disabled.

root@bilbo:~# clustat
Cluster Status for our_cluster @ Sat Apr 16 14:40:04 2011
Member Status: Quorate

 Member Name                                  ID   Status
 ------ ----                                  ---- ------
 hbbilbo.maison.ca                                1 Online, Local, rgmanager
 hbgandalf.maison.ca                              2 Online, rgmanager
 hbgollum.maison.ca                               3 Online, rgmanager

 Service Name                        Owner (Last)                        State
 ------- ----                        ----- ------                        -----
 service:srv_ftp                     hbgollum.maison.ca                  started
 service:srv_www                     (hbgandalf.maison.ca)               disabled

We will now enable the service, but this time on a server other than “hbgandalf”. This command enables the “srv_www” service on the server “hbbilbo”.

root@bilbo:~# clusvcadm -e srv_www -m hbbilbo
'hbbilbo' not in membership list
Closest match: 'hbbilbo.maison.ca'
Member hbbilbo.maison.ca trying to enable service:srv_www...Success
service:srv_www is now running on hbbilbo.maison.ca
root@bilbo:~#

We can see that it is now running on “hbbilbo”.

root@bilbo:~# clustat
Cluster Status for our_cluster @ Sat Apr 16 14:47:36 2011
Member Status: Quorate

 Member Name                                  ID   Status
 ------ ----                                  ---- ------
 hbbilbo.maison.ca                                1 Online, Local, rgmanager
 hbgandalf.maison.ca                              2 Online, rgmanager
 hbgollum.maison.ca                               3 Online, rgmanager

 Service Name                        Owner (Last)                        State
 ------- ----                        ----- ------                        -----
 service:srv_ftp                     hbgollum.maison.ca                  started
 service:srv_www                     hbbilbo.maison.ca                   started  
root@bilbo:~#
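The disable and enable steps above can be grouped into one function, using only the “-d”, “-e” and “-m” options shown in this section. This is a sketch, not a tested procedure: the names “move_via_disable” and “CLUSVCADM” are hypothetical, and the variable only exists so that a stub can replace the real command for a dry run.

```shell
#!/bin/bash
# Hypothetical helper: disable a service, then enable it on a chosen member.
# CLUSVCADM can point to a stub for a dry run; it defaults to clusvcadm.
move_via_disable() {
    local svc="$1" member="$2" cmd="${CLUSVCADM:-clusvcadm}"
    "$cmd" -d "$svc" || return 1       # stop here if the disable fails
    "$cmd" -e "$svc" -m "$member"
}
```

For example, “move_via_disable srv_www hbbilbo.maison.ca” reproduces the two commands used above. Unlike a relocation with “-r”, the service stays stopped if the enable step fails.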

 

 

This concludes our implementation of a small cluster. It was intended to give everyone a brief overview of how the Red Hat Cluster Suite actually works. We will now move on to other interesting topics. I don’t know what the next one will be, but I can assure you that it should fit into one article. I hope you appreciated this series and hope to see you soon.

 

Part 1 – Creating a Linux Red Hat/CentOS cluster

Part 2 – Creating a Linux Red Hat/CentOS cluster

Part 3 – Creating a Linux Red Hat/CentOS cluster

Part 4 – Creating a Linux Red Hat/CentOS cluster

Part 5 – Creating a Linux Red Hat/CentOS cluster

 
