Creating a Red Hat Cluster: Part 4
Welcome back to LINternUX, where we continue the creation of our cluster. By now you should have a working cluster with an ftp service and a web service defined. Although the services are created, they are not really running yet. In this article we will create a GFS filesystem that will allow us to share data between nodes. In the next and last article we’ll finalise the cluster by completing our ftp and web services so they really work. We will also show you how to manually move a service from one server to another. We still have some work to do, so let’s start right away.
Adding a SAN disk to our servers
The Linux operating system is installed on the internal disks of each of our servers. We will now add a SAN disk that will be visible to each of them. I assume here that your SAN and your Brocade switch are configured accordingly; explaining how to set them up is outside the scope of this article. The important point is that the new disk must be visible to every node in our cluster. In the example below we already have a SAN disk (sda) with one partition (sda1) on it. Adding a disk to a server can be done live, without any interruption of service, if you follow the steps below. I would suggest you practice on a test server to become familiar with the procedure.
Before we add a disk, let’s see which disks are visible on the system by looking at the /proc/partitions file. We can see that we already have a disk (sda) with one partition on it, so the new disk that we’re going to add should show up as “sdb”.
root@gollum~# grep sd /proc/partitions
8 0 104857920 sda
8 1 104856223 sda1
Let’s rescan the SCSI bus by typing the command below. This command must be run on each server in the cluster. Here, we have only one HBA (Host Bus Adapter) card connected to the SAN on each server. If you have a second HBA, you need to run the same command for it, replacing “host0” with “host1”.
root@gollum~# echo "- - -" > /sys/class/scsi_host/host0/scan
root@gandalf~# echo "- - -" > /sys/class/scsi_host/host0/scan
root@bilbo~# echo "- - -" > /sys/class/scsi_host/host0/scan
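If a server has more than one HBA, the same rescan can be looped over every host entry instead of typing one command per adapter. Here is a minimal sketch; the function name is ours, and the directory parameter is only there so you can try the loop on a scratch directory before pointing it at the real /sys tree.

```shell
# Rescan every SCSI host found under a sysfs-style directory.
# rescan_all_hosts is a hypothetical helper name, not a system command.
rescan_all_hosts() {
    base="${1:-/sys/class/scsi_host}"
    for scan in "$base"/host*/scan; do
        [ -e "$scan" ] || continue      # no host matched the pattern, skip
        echo "- - -" > "$scan"          # wildcard channel, target and LUN
        echo "rescanned ${scan%/scan}"
    done
}

# On a real node, as root:
# rescan_all_hosts
```

Run it as root on each node of the cluster, just like the single-host command.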
Let’s see if a new disk (sdb) was detected (check on each server).
root@gollum~# grep sd /proc/partitions
8 0 104857920 sda
8 1 104856223 sda1
8 16 15728640 sdb
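On a server with many disks, spotting the new one by eye gets tedious. One possible approach is to save a copy of /proc/partitions before the rescan and compare it with the current content afterwards. The sketch below assumes that workflow; the function names and the /tmp file name are ours.

```shell
# Print the device names (4th column) of a /proc/partitions-style file.
list_devices() {
    awk '$1 ~ /^[0-9]+$/ && NF == 4 { print $4 }' "${1:-/proc/partitions}" | sort
}

# Print the devices present in the second file but not in the first.
new_devices() {   # usage: new_devices <before_file> <after_file>
    before_list=$(mktemp)
    after_list=$(mktemp)
    list_devices "$1" > "$before_list"
    list_devices "$2" > "$after_list"
    comm -13 "$before_list" "$after_list"   # lines unique to the "after" list
    rm -f "$before_list" "$after_list"
}

# cp /proc/partitions /tmp/partitions.before    (before the rescan)
# new_devices /tmp/partitions.before /proc/partitions
```

With the two listings shown in this article, the only name reported would be sdb.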
Let’s create an LVM partition on our new disk (sdb) by running the “fdisk” command.
# fdisk /dev/sdb
Command (m for help): p
Disk /dev/sdh: 107.3 GB, 107374510080 bytes
255 heads, 63 sectors/track, 13054 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Device Boot Start End Blocks Id System
Command (m for help): n (n=new partition)
Command action
e extended
p primary partition (1-4)
p (p=primary partition)
Partition number (1-4): 1 (first partition =1)
First cylinder (1-13054, default 1): 1 (Start at the beginning of disk)
Last cylinder or +size or +sizeM or +sizeK (1-13054, default 13054): 13054 (End of the Disk)
Command (m for help): p (p=Print partition information)
Disk /dev/sdh: 107.3 GB, 107374510080 bytes
255 heads, 63 sectors/track, 13054 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Device Boot Start End Blocks Id System
/dev/sdb1 1 13054 104856223+ 83 Linux
Command (m for help): t (t=type of partition)
Selected partition 1 (Change type of partition 1)
Hex code (type L to list codes): 8e (8e=LVM partition – Type L to list partition code)
Changed system type of partition 1 to 8e (Linux LVM)
Command (m for help): p (p=print partition information)
Disk /dev/sdh: 107.3 GB, 107374510080 bytes
255 heads, 63 sectors/track, 13054 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Device Boot Start End Blocks Id System
/dev/sdb1 1 13054 104856223+ 8e Linux LVM
Command (m for help): w (w=write partition to disk)
The partition table has been altered!
Calling ioctl() to re-read partition table.
Syncing disks.
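If you have to repeat this on several disks, the answers typed in the session above can be fed to fdisk on its standard input. This is only a sketch: the function name is ours, the two empty answers accept the default first and last cylinder (the same values as the 1 and 13054 typed above), and other fdisk versions may ask different questions, so try it on a test disk first.

```shell
# Emit the answers of the interactive session, one per line:
# n (new), p (primary), 1 (partition number), default first cylinder,
# default last cylinder, t (change type), 8e (Linux LVM), w (write).
lvm_partition_keys() {
    printf '%s\n' n p 1 '' '' t 8e w
}

# Destructive: this rewrites the partition table of the target disk.
# lvm_partition_keys | fdisk /dev/sdb
```

The pipe into fdisk is left commented out on purpose, since it overwrites the partition table without asking.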
If we look again at /proc/partitions, we should see that our new disk and its partition are now seen by this server.
root@gollum~# grep sd /proc/partitions
8 0 104857920 sda
8 1 104856223 sda1
8 16 15728640 sdb
8 17 15727658 sdb1
Now we need to make sure that the new disk and partition are seen by every server in the cluster. To do that, run the “partprobe” (partition probe) command on every server. After running the command, check, like we did on “gollum”, that all the disks and partitions are seen by each server.
root@gollum~# partprobe
root@gandalf~# partprobe
root@bilbo~# partprobe
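Many steps in this article must be repeated on every node, so a small loop can save some typing. The sketch below assumes root ssh access from the node you are working on to the three servers, which this series has not set up; the function and the REMOTE variable are our own names.

```shell
NODES="gollum gandalf bilbo"

# Run one command on every node of the cluster.
# REMOTE defaults to ssh; with REMOTE=echo the commands are only printed.
on_all_nodes() {
    for node in $NODES; do
        ${REMOTE:-ssh} "root@$node" "$@"
    done
}

# on_all_nodes partprobe
# on_all_nodes grep sd /proc/partitions
```

Setting REMOTE=echo first is a cheap way to check what would be sent to each server before running it for real.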
Creating our physical volume
Now that we know that the disk is seen by every node, let’s create the physical volume on one of the servers and then check that it is seen on all the other servers in the cluster. The command to create a physical volume is “pvcreate”, so what we are really doing here is turning the partition we created earlier (/dev/sdb1) into a physical volume.
# pvcreate /dev/sdb1
Physical volume "/dev/sdb1" successfully created
Let’s run a pvscan on every node, to validate that every node can actually see the new disk.
# pvscan
PV /dev/sda1 VG datavg lvm2 [100.00 GB / 22.02 GB free]
PV /dev/sdb1 lvm2 [100.00 GB]
Create our clustered volume group
We will now create a new volume group named “sharevg” and assign the physical volume “/dev/sdb1” to it. If we ever run out of disk space within “sharevg”, we can add another physical volume to the volume group and continue to work without any service disruption. This is a real advantage when working in a production environment.
# vgcreate sharevg /dev/sdb1
Clustered volume group “sharevg” successfully created
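To illustrate the point about adding space later, here is a sketch of how a second physical volume could be added to “sharevg”. The partition /dev/sdc1 is hypothetical (it would have to be prepared and made visible on every node, like sdb1 above), and the helper names are ours. By default the commands are only printed; set DRY_RUN=0 to really run them.

```shell
# Print a command, or execute it when DRY_RUN=0.
run() {
    if [ "${DRY_RUN:-1}" = "0" ]; then
        "$@"
    else
        echo "$@"
    fi
}

# Turn a partition into a physical volume and add it to a volume group.
extend_vg() {   # usage: extend_vg <volume_group> <partition>
    run pvcreate "$2" && run vgextend "$1" "$2"
}

extend_vg sharevg /dev/sdc1   # /dev/sdc1 is a made-up second SAN partition
```

In the default dry-run mode this prints the pvcreate command followed by the vgextend command, without touching any disk.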
Display the “sharevg” volume group properties
We can display the volume group properties by issuing the “vgdisplay” command. We can see that our volume group is “Clustered”, so it is cluster aware. This will allow us, later on, to create a logical volume (LV) on one server and have the cluster software automatically advise the other cluster members that a new logical volume is available.
root@gollum~# vgdisplay sharevg
--- Volume group ---
VG Name sharevg
System ID
Format lvm2
Metadata Areas 1
Metadata Sequence No 25
VG Access read/write
VG Status resizable
Clustered yes
Shared no
MAX LV 0
Cur LV 0
Open LV 0
Max PV 0
Cur PV 1
Act PV 1
VG Size 100.00 GB
PE Size 4.00 MB
Total PE 24999
Alloc PE / Size 1 / 2.10 MB
Free PE / Size 24980 / 98.00 GB
VG UUID V8Ag76-vdW2-NAk4-JjOo-or3l-GuPz-x5LEKP
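If you want to check the “Clustered” flag from a script rather than by reading the vgdisplay output, the “vgs” command can print the volume group attribute string, whose sixth character is “c” for a clustered volume group. The helper below is a sketch with a name of our own; it only looks at the string it is given.

```shell
# Succeed when the sixth character of a vg_attr string is "c" (clustered).
is_clustered_attr() {
    attr=$(printf '%s' "$1" | tr -d ' ')   # vgs pads its output with spaces
    case "$attr" in
        ?????c*) return 0 ;;
        *)       return 1 ;;
    esac
}

# On a node:
# is_clustered_attr "$(vgs --noheadings -o vg_attr sharevg)" && echo "sharevg is clustered"
```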
Create a logical volume of 1024MB named “cadminlv” in the “sharevg” volume group
We will create a logical volume named “cadminlv” (Cluster Admin) in our sharevg volume group. The command below asks for a logical volume of 1024MB, named “cadminlv”, in the volume group “sharevg”. This command can be run on one server and the logical volume will be seen by every member of the cluster.
# /usr/sbin/lvcreate -L 1024M -n cadminlv sharevg
Logical volume “cadminlv” created
The “lvs” command displays a list of all your logical volumes. Since we are only interested in the volume group “sharevg”, where “cadminlv” is currently the only logical volume, we filter the list with the grep command. Let’s check that it is seen by all nodes.
root@gandalf~# lvs | grep sharevg
cadminlv sharevg -wi-a- 1024.00M
root@gollum~# lvs | grep sharevg
cadminlv sharevg -wi-a- 1024.00M
root@bilbo~# lvs | grep sharevg
cadminlv sharevg -wi-a- 1024.00M
Creating the /cadmin GFS Filesystem
Finally, we are going to create a GFS filesystem within the logical volume “cadminlv” we’ve just created. But first we need to create the GFS filesystem mount point on every node, because we want this filesystem to be mounted on, and available to, all 3 nodes.
root@gandalf~# mkdir /cadmin
root@bilbo~# mkdir /cadmin
root@gollum~# mkdir /cadmin
We have chosen to have the GFS filesystem “/cadmin” mounted on all servers at all times. We could have included it as part of our service, so it would be mounted only when a service is started. But we found out that unmounting and mounting a couple of GFS filesystems takes time, and this time adds to the time it takes to move a service from one server to another. In production we have had 5 servers in a cluster for more than 2 years now, with around 30 GFS filesystems mounted at all times on all five servers, and we have had very few problems. The only thing you have to be careful about is the number of journals that you assign to each GFS filesystem. One journal is needed for each concurrent mount in the cluster, so in our case we have at least 5 journals for each of our GFS filesystems (more on that below).
Let’s create the GFS filesystem on the “cadminlv” logical volume created previously. This needs to be done on only one node; the creation is done once and all the nodes are made aware of the new GFS filesystem by the cluster daemon.
The command we use to create a GFS filesystem is “gfs_mkfs”. We need to use a couple of options and I will explain them all.
First, the “-O” option prevents “gfs_mkfs” from asking for confirmation before creating the filesystem.
The option “-p lock_dlm” indicates the name of the locking protocol to use. The locking protocol should be “lock_dlm” for a clustered filesystem.
The “-t our_cluster:cadminlv” option sets the lock table name: the cluster name, followed by “:” and the filesystem name. The cluster name must match the one defined in your cluster configuration file (in our case “our_cluster”); only members of this cluster are permitted to use this filesystem. The filesystem name (here we reuse the logical volume name, “cadminlv”) is a unique name, 1 to 16 characters long, used to distinguish this GFS filesystem from the others.
The “-j 4” option is the number of journals for gfs_mkfs to create. You need at least one journal per machine that will mount the filesystem. This number could have been 3, but I always add one more, in case I add a member to the cluster. This number is important: if I had put 3 and later added a node to the cluster, the fourth node would not be able to mount the filesystem at the same time as the others until a fourth journal existed. You can always add a journal to an existing GFS filesystem with the “gfs_jadd” command. Each journal reserves 128 MB in the filesystem, so you need to take that into consideration. Let’s look at our example: we want all our nodes to mount the “/cadmin” GFS filesystem, we created a logical volume of 1024MB, and on it we created a GFS filesystem with 4 journals (4*128=512MB), so we will have only around 500 MB available for data out of the 1024MB we allocated to our logical volume.
The last parameter, “/dev/sharevg/cadminlv”, is the device name of the logical volume we created previously.
# /sbin/gfs_mkfs -O -p lock_dlm -t our_cluster:cadminlv -j 4 /dev/sharevg/cadminlv
Device: /dev/sharevg/cadminlv
Blocksize: 4096
Filesystem Size: 98260
Journals: 4
Resource Groups: 8
Locking Protocol: lock_dlm
Lock Table: our_cluster:cadminlv
Syncing…
All Done
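The journal arithmetic described above is worth doing before you size a logical volume. Here is a small sketch of it; the function name is ours and the 128 MB per journal is the figure quoted in this article, so pass another value as third argument if your journals have a different size.

```shell
# Space left for data, in MB, once the journals are set aside.
usable_mb() {   # usage: usable_mb <lv_size_mb> <journals> [journal_size_mb]
    echo $(( $1 - $2 * ${3:-128} ))
}

usable_mb 1024 4    # our /cadmin: 4 journals on a 1024 MB volume
usable_mb 1024 5    # the same volume if a 5th journal were needed
```

The first call gives 512 and the second 384, which shows how quickly journals eat into a small volume.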
We are now able to mount our GFS filesystem by running the command below on all the servers.
# mount -t gfs /dev/sharevg/cadminlv /cadmin
We want that filesystem to be mounted every time a server boots, so don’t forget to add it to /etc/fstab on each server, and don’t forget to change the owner and permissions of the filesystem.
# echo "/dev/sharevg/cadminlv /cadmin gfs defaults 0 0" >> /etc/fstab
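Running that echo twice would leave a duplicate line in /etc/fstab. If you script this step, one option is to append the entry only when the mount point is not already listed. The sketch below does that; the function name is ours and the last argument lets you try it on a copy of the file first.

```shell
# Append an fstab entry only if the mount point is not already listed.
add_fstab_entry() {   # usage: add_fstab_entry <device> <mountpoint> <fstype> [fstab_file]
    fstab="${4:-/etc/fstab}"
    # Look for a non-comment line whose second field is the mount point.
    if awk -v mp="$2" '$1 !~ /^#/ && $2 == mp { found = 1 } END { exit !found }' "$fstab"; then
        echo "$2 is already in $fstab"
    else
        echo "$1 $2 $3 defaults 0 0" >> "$fstab"
    fi
}

# add_fstab_entry /dev/sharevg/cadminlv /cadmin gfs
```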
The filesystem should be available on all our nodes.
# df -h /cadmin
Filesystem Size Used Avail Use% Mounted on
/dev/mapper/sharevg-cadminlv 510M 1.5M 508M 1% /cadmin
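The df output is fine for a quick look, but in a script it can be simpler to test whether the mount point appears with the gfs type in /proc/mounts. A sketch, with a function name of our own and an optional file argument so it can be tried against a saved copy:

```shell
# Succeed when <mountpoint> is listed with type <fstype> in a /proc/mounts-style file.
is_mounted_as() {   # usage: is_mounted_as <mountpoint> <fstype> [mounts_file]
    awk -v mp="$1" -v fs="$2" \
        '$2 == mp && $3 == fs { ok = 1 } END { exit !ok }' "${3:-/proc/mounts}"
}

# if is_mounted_as /cadmin gfs; then echo "/cadmin OK"; else echo "/cadmin missing"; fi
```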
So we’ve just created our first GFS filesystem and it is mounted on all our nodes in the cluster.
In our last article we will finalise our cluster by creating the scripts needed for our ftp and web services to start and to move from server to server. We will add these scripts to our cluster configuration and show you how to move a service from one server to another using the command line and the GUI. So stay tuned for this last article on how to build a Red Hat cluster.
Part 1 – Creating a Linux Red Hat/CentOS cluster
Part 2 – Creating a Linux Red Hat/CentOS cluster
Part 3 – Creating a Linux Red Hat/CentOS cluster
Part 4 – Creating a Linux Red Hat/CentOS cluster
Part 5 – Creating a Linux Red Hat/CentOS cluster
