
Creating a Red Hat Cluster: Part 4


Welcome back to LINternUX, where we continue building our cluster. By now you should have a working cluster with an ftp service and a web service defined. Although the services are created, our ftp and web services are not really running yet. In this article we will create a GFS filesystem that will allow us to share data between the nodes. In the next and last article we’ll finalise the cluster by completing our ftp and web services so they really work, and we’ll also show you how to manually move a service from one server to another. We still have some work to do, so let’s start right away.

 

Adding a SAN disk to our servers

The Linux operating system is installed on the internal disks of each of our servers. We will now add a SAN disk that will be visible to each of them. I assume here that your SAN and your Brocade switch are already configured accordingly; explaining how to set them up is outside the scope of this article. The important point is that the new disk must be visible to every node in our cluster. In the example below we already have a SAN disk (sda) with one partition (sda1) on it. Adding a disk to the servers can be done live, without any interruption of service, if you follow the steps below. I would suggest you practice on a test server first, to become familiar with the procedure.

Before we add the disk, let’s see which disks are currently visible on the system by looking at the /proc/partitions file. We can see that we already have one disk (sda) with one partition on it, so the new disk that we’re going to add should show up as “sdb”.

root@gollum~# grep sd /proc/partitions
8     0  104857920 sda
8     1  104856223 sda1

Let’s rescan the SCSI bus by typing the command below. This command must be run on each of the servers in the cluster. Here we have only one HBA (Host Bus Adapter) card connected to the SAN on each server. If you have a second HBA, you need to run the same command again, replacing “host0” with “host1”.

root@gollum~# echo "- - -" > /sys/class/scsi_host/host0/scan
root@gandalf~# echo "- - -" > /sys/class/scsi_host/host0/scan
root@bilbo~# echo "- - -" > /sys/class/scsi_host/host0/scan
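
If your servers have several HBAs, instead of repeating the command for “host1”, “host2” and so on, you can rescan every SCSI host in one pass with a small loop like the sketch below (run it on each server):

# for h in /sys/class/scsi_host/host*; do echo "- - -" > $h/scan; done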

Let’s check that the new disk (sdb) was detected (do this on each server).

root@gollum~# grep sd /proc/partitions
8     0  104857920 sda
8     1  104856223 sda1
8    16  104857920 sdb

Let’s create an LVM partition on our new disk (sdb) by running the “fdisk” command.

# fdisk /dev/sdb

Command (m for help): p

Disk /dev/sdb: 107.3 GB, 107374510080 bytes
255 heads, 63 sectors/track, 13054 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Device Boot      Start         End      Blocks   Id  System

Command (m for help): n (n=new partition)
Command action
e   extended
p   primary partition (1-4) p (p=primary partition)
Partition number (1-4): 1 (first partition =1)
First cylinder (1-13054, default 1):  1 (Start at the beginning of disk)
Last cylinder or +size or +sizeM or +sizeK (1-13054, default 13054): 13054 (End of the Disk)
Command (m for help): p (p=Print partition information)

Disk /dev/sdb: 107.3 GB, 107374510080 bytes
255 heads, 63 sectors/track, 13054 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Device Boot      Start         End      Blocks   Id  System
/dev/sdb1               1       13054   104856223+  83  Linux

Command (m for help): t (t=type of partition)
Selected partition 1 (Change type of partition 1)
Hex code (type L to list codes): 8e (8e=LVM partition – Type L to list partition code)
Changed system type of partition 1 to 8e (Linux LVM)
Command (m for help): p (p=print partition information)

Disk /dev/sdb: 107.3 GB, 107374510080 bytes
255 heads, 63 sectors/track, 13054 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Device Boot      Start         End      Blocks   Id  System
/dev/sdb1               1       13054   104856223+  8e  Linux LVM

Command (m for help): w (w=write partition to disk)

The partition table has been altered!
Calling ioctl() to re-read partition table.
Syncing disks.

If we look again at /proc/partitions, we should see that the new disk and its partition are now visible on this server.

root@gollum~# grep sd /proc/partitions
8     0  104857920 sda
8     1  104856223 sda1
8    16  104857920 sdb
8    17  104856223 sdb1

Now we need to make sure that the new disk and partition are seen by every server in the cluster. Go to each server in the cluster and run the “partprobe” (partition probe) command. After running it, check on each server, as we did on “gollum”, that all the disks and partitions are visible.

root@gollum~# partprobe
root@gandalf~# partprobe
root@bilbo~# partprobe
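
If you have password-less root ssh access between the nodes (an assumption, not something we set up in this series), you can run the verification from a single server with a small loop like this one:

# for node in gollum gandalf bilbo; do ssh $node "grep sdb /proc/partitions"; done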

 

Creating our physical volume

Now that we know the disk is seen by every node, let’s create the physical volume on one of the servers and then check on all the other servers in the cluster that the physical volume is visible everywhere. The command to create a physical volume is “pvcreate”, so what we are really doing here is turning the partition (/dev/sdb1) we created earlier into a physical volume.

# pvcreate /dev/sdb1
Physical volume "/dev/sdb1" successfully created

Let’s run a pvscan on every node, to validate that every node can actually see the new disk.

# pvscan
PV /dev/sda1           VG     datavg      lvm2 [100.00 GB / 22.02 GB free]
PV /dev/sdb1                                         lvm2 [100.00 GB]

 

Create our clustered volume group

We will now create a new volume group named “sharevg” and assign the physical volume “/dev/sdb1” to that group. If we ever run out of disk space in “sharevg”, we can add another physical volume to the volume group and continue working without any service disruption. This is a real advantage when working in a production environment.


# vgcreate sharevg /dev/sdb1

Clustered volume group "sharevg" successfully created
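
As an example of how the volume group could be grown later, if we added a second SAN disk (say a hypothetical /dev/sdc, partitioned the same way as /dev/sdb), we could bring the extra space into “sharevg” online, roughly like this:

# pvcreate /dev/sdc1
# vgextend sharevg /dev/sdc1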

 

Display the “sharevg” volume group properties

We can display the volume group properties by issuing the “vgdisplay” command. We can see that our volume group is “Clustered”, so it is cluster aware. This will later allow us to create an “LV” (logical volume/filesystem) on one server and have the cluster software automatically advise the other cluster members that a new logical volume (filesystem) is available.

root@gollum~# vgdisplay sharevg
 --- Volume group ---
 VG Name               sharevg
 System ID            
 Format                lvm2
 Metadata Areas        1
 Metadata Sequence No  25
 VG Access             read/write
 VG Status             resizable
 Clustered             yes
 Shared                no
 MAX LV                0
 Cur LV                0
 Open LV               0
 Max PV                0
 Cur PV                1
 Act PV                1
 VG Size               100.00 GB
 PE Size               4.00 MB
 Total PE              24999
 Alloc PE / Size       1 / 2.10 MB
 Free  PE / Size       24980 / 98.00 GB
 VG UUID               V8Ag76-vdW2-NAk4-JjOo-or3l-GuPz-x5LEKP

 

Create a 1024 MB logical volume named “cadminlv” in the “sharevg” volume group

We will create a logical volume named “cadminlv” (Cluster Admin) in our “sharevg” volume group. The command below asks for a logical volume of 1024 MB, named “cadminlv”, in the volume group “sharevg”. The command needs to be run on only one server and the logical volume will be seen by every member of the cluster.

# /usr/sbin/lvcreate -L1024M -n cadminlv sharevg
Logical volume "cadminlv" created

 

The “lvs” command allows you to display a list of all your logical volumes. Since this is currently the only one in the “sharevg” volume group, we filter the list (with the grep command) to display only the logical volumes of the “sharevg” volume group. Let’s check that it is seen by all nodes.

root@gandalf~# lvs | grep sharevg
cadminlv sharevg -wi-a- 1024.00M

root@gollum~# lvs | grep sharevg
cadminlv sharevg -wi-a- 1024.00M

root@bilbo~# lvs | grep sharevg
cadminlv sharevg -wi-a- 1024.00M

 

Creating the /cadmin GFS Filesystem

Finally, we are going to create a GFS filesystem within the logical volume “cadminlv” we’ve just created. But first we need to create the GFS filesystem mount point on every node, because we want this filesystem to be mounted on, and available to, all three nodes.

root@gandalf~# mkdir /cadmin
root@bilbo~# mkdir /cadmin
root@gollum~# mkdir /cadmin

 

We have chosen to have the “/cadmin” GFS filesystem mounted on all servers at all times. We could have included it as part of our service, so it would be mounted only when the service starts, but we found that unmounting and mounting a couple of GFS filesystems takes time, and that time adds up to the time it takes to move a service from one server to another. In production we have run a cluster of 5 servers for more than 2 years now, with around 30 GFS filesystems mounted at all times on all five servers, and we have had very few problems. The only thing you have to be careful about is the number of journals that you assign to each GFS filesystem: one journal is needed for each concurrent mount in the cluster, so in our case we have at least 5 journals for each of our GFS filesystems (more on that below).

Let’s now create the GFS filesystem on the “cadminlv” logical volume created previously. This needs to be done on only one node; the creation is done once and all the nodes are made aware of the new GFS filesystem by the cluster daemon.

The command we use to create a GFS filesystem is “gfs_mkfs”.  We need to use a couple of options and I will explain them all.

First, the “-O” option prevents “gfs_mkfs” from asking for confirmation before creating the filesystem.

The option “-p lock_dlm” indicates the name of the locking protocol to use. The locking protocol should be “lock_dlm” for a clustered file system.

The “-t our_cluster:cadminlv” is the lock table name: the cluster name, followed by “:”, followed by the filesystem name. The cluster name must match the one defined in your cluster configuration file (in our case “our_cluster”); only members of this cluster are permitted to use this file system. The filesystem name (here “cadminlv”) is a unique name of 1 to 16 characters used to distinguish this GFS file system from any others you create.

The “-j 4” is the number of journals for gfs_mkfs to create. You need at least one journal per machine that will mount the filesystem. This number could have been 3, but I always add one more in case I add a member to the cluster. The number matters: if I had used 3, then added a node to the cluster and wanted all 4 nodes to mount this filesystem simultaneously, I would have to make sure the filesystem has 4 journals, because otherwise GFS would not mount it. You can always add a journal to an existing GFS filesystem with the “gfs_jadd” command, as shown in the sketch below. Each journal reserves 128 MB in the filesystem, so you need to take this into consideration. In our example, we want all our nodes to mount the “/cadmin” GFS filesystem; we created a logical volume of 1024 MB and will create a GFS filesystem on it with 4 journals (4 * 128 = 512 MB), so we will have only around 500 MB available for data out of the 1024 MB we allocated to the logical volume.
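
For example, if we later needed a fifth journal for “/cadmin”, we would first grow the logical volume to make room for the extra 128 MB and then add the journal with “gfs_jadd”; a rough sketch (the sizes are illustrative):

# lvextend -L +128M /dev/sharevg/cadminlv (make room for one more 128 MB journal)
# gfs_jadd -j 1 /cadmin (add one journal to the mounted GFS filesystem)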

The last parameter, “/dev/sharevg/cadminlv”, is the device path of the logical volume we created previously.

# /sbin/gfs_mkfs -O -p lock_dlm -t our_cluster:cadminlv -j 4 /dev/sharevg/cadminlv
Device:                    /dev/sharevg/cadminlv
Blocksize:                 4096
Filesystem Size:           98260
Journals:                  4
Resource Groups:           8
Locking Protocol:          lock_dlm
Lock Table:                our_cluster:cadminlv
Syncing…
All Done

We are now able to mount our GFS filesystem by running the command below on each of the servers.

# mount -t gfs /dev/sharevg/cadminlv  /cadmin

We want the filesystem to be mounted every time a server boots, so don’t forget to add it to /etc/fstab on every node, so it will be mounted again after the next reboot, and don’t forget to set the owner and permissions of the filesystem.

# echo "/dev/sharevg/cadminlv /cadmin  gfs defaults 0 0" >> /etc/fstab
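
For example, something along these lines (the owner, group and mode are only placeholders, adjust them to whatever your ftp and web services require):

# chown root:apache /cadmin
# chmod 775 /cadmin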

The filesystem should be available on all our nodes.

# df -h /cadmin
Filesystem                    Size  Used Avail Use% Mounted on
/dev/mapper/sharevg-cadminlv  510M  1.5M  508M   1% /cadmin
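
If you are curious where the rest of the 1024 MB went, the “gfs_tool df” command from the GFS utilities gives a more detailed breakdown of the filesystem than a plain df, including the space taken by the journals:

# gfs_tool df /cadmin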

 

So we’ve just created our first GFS filesystem and it is mounted on all our nodes in the cluster.

In our last article we will finalise our cluster by creating the scripts needed for our ftp and web services to start and to move from server to server. We will add these scripts to our cluster configuration and show you how to move a service from one server to another using both the command line and the GUI. So stay tuned for this last article on how to build a Red Hat cluster.

 

Part 1 – Creating a Linux Red Hat/CentOS cluster

Part 2 – Creating a Linux Red Hat/CentOS cluster

Part 3 – Creating a Linux Red Hat/CentOS cluster

Part 4 – Creating a Linux Red Hat/CentOS cluster

Part 5 – Creating a Linux Red Hat/CentOS cluster

 
