Thursday, October 24, 2013

NFS Active/Passive using DRBD and Heartbeat on CentOS 5





yum repos used - EPEL, RPMforge, and the ClusterLabs repo:

wget -O /etc/yum.repos.d/pacemaker.repo http://clusterlabs.org/rpm/epel-5/clusterlabs.repo
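With those repos in place the packages used below can be pulled in with yum, roughly like this (a sketch; the package names match the rpm -qa output shown further down, and depending on your mirror the DRBD packages may come from CentOS extras instead):

# yum install -y drbd83 kmod-drbd83 heartbeat heartbeat-libs pacemaker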

VIP - 192.168.1.70 (see the haresources file in the Heartbeat section below)

[root@a-log1 ~]# cat /etc/hosts

192.168.1.71    a-log1
192.168.1.72    a-log2

DRBD setup


[root@a-log1 ~]# rpm -qa | grep drbd
kmod-drbd83-8.3.15-3.el5.centos
drbd83-8.3.15-2.el5.centos

config file 

[root@a-log1 ~]# cat /etc/drbd.conf
#
# please have a a look at the example configuration file in
# /usr/share/doc/drbd83/drbd.conf
#

include "/etc/drbd.d/global_common.conf";
include "/etc/drbd.d/*.res";

[root@a-log1 ~]# cat /etc/drbd.d/res0.res
resource r0 {
  protocol C;
  startup { wfc-timeout 0; degr-wfc-timeout     30; }
  disk { on-io-error detach; no-disk-barrier; no-disk-flushes; no-md-flushes; no-disk-drain; }
  net {  cram-hmac-alg "sha1"; shared-secret "nit3fl!rtZ"; sndbuf-size 512k; rcvbuf-size 512k; max-epoch-size 8000; max-buffers 8000; unplug-watermark 16;}
  syncer { rate 180M; al-extents 3389; }

  device    /dev/drbd1;
  disk      /dev/sdb1;
  meta-disk internal;
  on a-log1 {
    address   192.168.1.71:7789;
  }
  on a-log2 {
    address   192.168.1.72:7789;
  }
}
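The resource sits on /dev/sdb1 with internal metadata. If that partition does not exist yet, create it on both nodes first - a minimal sketch, assuming the whole of /dev/sdb is given to DRBD:

# parted -s /dev/sdb mklabel msdos            # new partition table
# parted -s /dev/sdb mkpart primary 0% 100%   # one partition -> /dev/sdb1
# partprobe /dev/sdb                          # make the kernel re-read the partition table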

Let us prepare the raw /dev/sdb1 on the a-log1 and a-log2 servers, i.e. we now have to create the DRBD metadata on both servers:

a-log1# drbdadm create-md r0

a-log2# drbdadm create-md r0

Now we can start DRBD on both servers at the same time:



[root@a-log1]# service drbd start

Starting DRBD resources: [ d(main) s(main) n(main) ].


[root@a-log2]# service drbd start

Starting DRBD resources: [ d(main) s(main) n(main) ].


There are two ways (that I know of) to verify that DRBD is running properly:



# service drbd status



# cat /proc/drbd

As you can see, both nodes report cs:Connected but ro:Secondary/Secondary, meaning we have not yet told DRBD which node is the Primary (master) holding the data to be replicated. Once we promote one node to Primary, the synchronization will start.

Promote the primary server (study the DRBD webpage - this can also be set in the config):
a-log1# drbdsetup /dev/drbd1 primary -o
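The equivalent drbdadm invocation can also be used, if preferred:

a-log1# drbdadm -- --overwrite-data-of-peer primary r0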

The synchronization starts and will take a little while to complete; wait until it is done before moving to the next step.
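Sync progress can be watched on either node with, for example:

# watch -n2 cat /proc/drbd

Once the sync has finished, /proc/drbd on the primary should look like this: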



0: cs:Connected ro:Primary/Secondary ds:UpToDate/UpToDate C r-----

ns:41941724 nr:0 dw:0 dr:41942388 al:0 bm:2560 lo:0 pe:0 ua:0 ap:0 ep:1 wo:b oos:0





Now that our servers are in sync we can format /dev/drbd1 with our preferred file system. In my case I will use ext4.

On primary a-log1:



a-log1# mkfs.ext4 /dev/drbd1

Configuring NFS exports for Heartbeat integration

Okay, now we have our DRBD up and running. Great! The next step is to set up our NFS export and then install Heartbeat so we have an automatic failover system.

Let's prepare our NFS first:

On the primary server (only the node where DRBD is currently Primary can mount the device):

# mount /dev/drbd1 /data     # mount the DRBD device
# mkdir /data/main           # create the directory that will hold the exported data


 [root@a-log1 ~]# cat /etc/exports
/data/main 192.168.1.0/255.255.255.0(rw,sync,no_root_squash,fsid=11280987)

 [root@a-log1 ~]# ls -l /data/
total 36
drwx------  2 root root 16384 Oct 23 13:09 lost+found
drwxr-xr-x 14 root root 12288 Oct 23 21:55 main
drwxr-xr-x  5 root root  4096 Oct 24 22:22 nfs
(The nfs directory above shows up later, once /var/lib/nfs is moved onto the DRBD device; see below.)

[root@a-log2 ~]# cat /etc/exports
/data/main 192.168.1.0/255.255.255.0(rw,sync,no_root_squash,fsid=11280987)
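With /etc/exports in place, the export table on the active node can be refreshed and checked with exportfs (assuming the nfs service is already running there):

# exportfs -ra      # re-export everything listed in /etc/exports
# exportfs -v       # show the currently exported directories and their options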

NFS stores some state about your NFS mounts in /var/lib/nfs, and since that information has to be mirrored as well, we move it onto the DRBD device:

 # mv /var/lib/nfs/ /data/

This might generate errors such as:

mv: cannot remove `/var/lib/nfs/rpc_pipefs/cache': Operation not permitted

mv: cannot remove `/var/lib/nfs/rpc_pipefs/nfsd4_cb': Operation not permitted

mv: cannot remove `/var/lib/nfs/rpc_pipefs/statd': Operation not permitted

mv: cannot remove `/var/lib/nfs/rpc_pipefs/portmap': Operation not permitted

mv: cannot remove `/var/lib/nfs/rpc_pipefs/nfs': Operation not permitted

mv: cannot remove `/var/lib/nfs/rpc_pipefs/mount': Operation not permitted

mv: cannot remove `/var/lib/nfs/rpc_pipefs/lockd': Operation not permitted

Do not worry about it; the needed directories get created anyway.


# mv /var/lib/nfs /var/lib/nfsBackup


Then symlink /var/lib/nfs to our /data directory:


# ln -s /data/nfs/ /var/lib/nfs

# umount /data


On server 2 (a-log2) only:

# mv /var/lib/nfs/ /var/lib/nfsBackup
# mkdir /data

# ln -s /data/nfs/ /var/lib/nfs


The symbolic link will appear broken on a-log2 since /dev/drbd1 is not mounted there; it resolves once the node takes over /data during an NFS failover.
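A quick sanity check on both nodes that the link points at the DRBD-backed copy:

# ls -ld /var/lib/nfs       # should show: /var/lib/nfs -> /data/nfs/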

Heartbeat installation and configuration

 [root@a-log1 ~]# rpm -qa | grep heart
heartbeat-3.0.3-2.3.el5
heartbeat-libs-3.0.3-2.3.el5

[root@a-log1 ~]# rpm -qa | grep pace
pacemaker-1.0.12-1.el5.centos

 [root@a-log1 ha.d]# cat ha.cf
debugfile /var/log/ha-debug
logfile /var/log/ha-log
logfacility local0
keepalive 1
warntime 5
deadtime 10
initdead 30
bcast eth0
ping 192.168.1.1
node    a-log1
node    a-log2

[root@a-log1 ha.d]# cat authkeys
auth 1
1 sha1 n11f
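Heartbeat refuses to start if authkeys is readable by anyone other than root, so make sure it is mode 600 on both nodes:

# chmod 600 /etc/ha.d/authkeys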

[root@a-log1 ha.d]# cat haresources
a-log1 IPaddr::192.168.1.70/24/eth0:0 drbddisk::r0 Filesystem::/dev/drbd1::/data::ext4 nfslock nfs
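Reading that line left to right: a-log1 is the preferred node, and on whichever node is active Heartbeat starts the resources in the listed order (stopping them in reverse on failover):

a-log1                                   # preferred (primary) node
IPaddr::192.168.1.70/24/eth0:0           # bring up the VIP on alias eth0:0
drbddisk::r0                             # make this node DRBD Primary for r0
Filesystem::/dev/drbd1::/data::ext4      # mount /dev/drbd1 on /data as ext4
nfslock nfs                              # start the lock manager, then the NFS server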

Passive server a-log2

[root@a-log2 ~]# cat /etc/ha.d/ha.cf
debugfile /var/log/ha-debug
logfile /var/log/ha-log
logfacility local0
keepalive 1
warntime 5
deadtime 10
initdead 30
bcast eth0
ping 192.168.1.1
node    a-log1
node    a-log2
[root@a-log2 ha.d]# cat authkeys
auth 1
1 sha1 n11f



 [root@a-log2 ~]# cat /etc/ha.d/haresources
a-log1 IPaddr::192.168.1.70/24/eth0:0 drbddisk::r0 Filesystem::/dev/drbd1::/data::ext4 nfslock nfs

Final important note - we are using NFS version 4:
[root@a-log2 ~]# ps aux | grep nfs
root      9975  0.0  0.0      0     0 ?        S<   23:05   0:00 [nfsd4]
root      9976  0.0  0.0      0     0 ?        S    23:05   0:00 [nfsd]
root      9977  0.0  0.0      0     0 ?        S    23:05   0:00 [nfsd]
root      9978  0.0  0.0      0     0 ?        S    23:05   0:00 [nfsd]
root      9979  0.0  0.0      0     0 ?        S    23:05   0:00 [nfsd]
On both servers, open the NFS init script and change the signal used to stop nfsd:

# vim /etc/init.d/nfs
        #killproc nfsd -2        <-- comment this line and change it to
        killproc nfsd -9
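Since nfslock and nfs are managed by Heartbeat (they appear in haresources), one common arrangement is to take them out of the normal boot sequence and let drbd and heartbeat come up at boot instead, for example:

# chkconfig nfs off
# chkconfig nfslock off
# chkconfig drbd on
# chkconfig heartbeat on
# service heartbeat start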

 Test

 Primary a-log1 

 [root@a-log1 ~]# /etc/init.d/heartbeat stop
Stopping High-Availability services:                       [  OK  ]


Check the a-log2 server - the VIP failover is successful:

[root@a-log2 ~]# ip addr
1: lo: mtu 16436 qdisc noqueue
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
2: eth0: mtu 1500 qdisc pfifo_fast qlen 1000
    link/ether 08:00:27:4a:db:2f brd ff:ff:ff:ff:ff:ff
    inet 192.168.1.72/24 brd 192.168.1.255 scope global eth0
    inet 192.168.1.70/24 brd 192.168.1.255 scope global secondary eth0:0
    inet6 fe80::a00:27ff:fe4a:db2f/64 scope link
       valid_lft forever preferred_lft forever
3: sit0: mtu 1480 qdisc noop
    link/sit 0.0.0.0 brd 0.0.0.0

In less than 10 seconds, NFS service fails over from a-log1 to the a-log2 server.
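Besides the VIP, it is worth confirming on a-log2 that the DRBD role, the mount and the NFS daemons moved over as well, for example:

# cat /proc/drbd            # r0 should now show ro:Primary/Secondary on a-log2
# df -h /data               # /dev/drbd1 should be mounted on /data
# service nfs status        # nfsd should be running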


Added an NFS client - the a5-web server:

 [root@a5-web ~]# cat /etc/fstab
192.168.1.70:/data/main /mnt/storage nfs noatime,nodev 0 0
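On the client, the mount point must exist before mounting; for example:

# mkdir -p /mnt/storage
# mount /mnt/storage        # picks up the options from /etc/fstab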

Perform continuous writes:
 [root@a5-web ~]# for i in `seq 1 10000`;do echo "hello how r u" >> /mnt/storage/ashok/nf.txt;sleep 1;done

Perform continuous reads:
 [root@a5-web ~]# for i in `seq 1 10000`;do cat /mnt/storage/ashok/nf.txt | wc -l;sleep 2 ; done
13175
On the primary server you can stop heartbeat and watch what happens on the a5-web NFS client.
Later, start heartbeat again and watch everything work as expected ;)

Another test - Download a file.

 [root@a5-web ~]# wget --no-cookies  --no-check-certificate --header "Cookie: gpw_e24=http%3A%2F%2Fedelivery.oracle.com" "http://download.oracle.com/otn-pub/java/jdk/6u38-b05/jdk-6u38-linux-i586.bin"

Try copying this file to /mnt/storage/.

While it is copying, stop heartbeat on the primary server... it should continue to copy and finish!

Check md5sum of both files.

 [root@a5-web ~]# md5sum /mnt/storage/ashok/jdk-6u38-linux-i586.bin
5bae3dc304d32a7e3c50240eab154e24  /mnt/storage/ashok/jdk-6u38-linux-i586.bin
[root@a5-web ~]# md5sum jdk-6u38-linux-i586.bin
5bae3dc304d32a7e3c50240eab154e24  jdk-6u38-linux-i586.bin
[root@a5-web ~]# 


Of course, please watch the magic happening in the logs on both servers ;)

[root@a-log1 ~]# tail -f /var/log/messages /var/log/ha-*
[root@a-log2 ~]# tail -f /var/log/messages /var/log/ha-*


Cool :)

reference -
http://www.linux-ha.org/wiki/Haresources
http://www.gossamer-threads.com/lists/linuxha/dev/59880
http://www.linuxnix.com/2010/01/heartbeat-clustering.html
