#DRBD based disk replication of a production cluster to a remote site cluster on RHEL 6

December 9, 2015

1 General Considerations

Our example enterprise applications run on a Linux cluster with a shared cluster storage resource. This HA setup ensures a high rate of service availability on the production site.
To keep any disruption of our enterprise application's service as short as possible, the best solution is to maintain an identical environment on a remote site. The two sites are kept in a mirror state by site-to-site data replication using DRBD. So that the replication does not decrease the performance of the application systems even over the distance between the sites, the data is replicated asynchronously with DRBD.

Because our applications use the database exclusively as the storage place for all their data, we only have to replicate the database-related files, which are stored on both cluster sites under the logical volume /dev/vg_data/lv_data. The same configuration can also be used to replicate the application files as well, or an entire virtual appliance.

2 DRBD configuration

2.1 Prerequisites

There are several prerequisites that must be met before replication is configured between the sites:

1. Install the drbd and kmod-drbd packages. Make sure the latest versions for Red Hat Enterprise Linux 6 are installed.
2. Make sure the routes are added so that the servers of the two sites are visible to each other.
3. Make sure the logical volumes /dev/vg_data/lv_data are correctly created on both the production and remote sites. To be correctly created, only two-thirds of the vg_data volume group space should be allocated to the lv_data logical volume. The rest of the vg_data space is used by the DRBD replication mechanism to store temporary replication data.
4. Make a backup of the data in the /data directory (the directory where /dev/vg_data/lv_data is mounted)
5. Remove the logical volume lv_data on both the production and remote sites
6. Recreate the logical volume lv_data on both the production and remote sites

7. Activate the logical volume lv_data
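Steps 4–7 can be sketched as follows, run on one node of each site; the 66%VG figure implements the two-thirds rule from step 3, and the backup step itself depends on your backup tooling:

```sh
# step 4: back up the current contents of /data first, then unmount it
umount /data

# steps 5-6: remove and recreate lv_data, leaving about one third of
# vg_data free for the temporary data DRBD stores during resync
lvremove /dev/vg_data/lv_data
lvcreate -n lv_data -l 66%VG vg_data

# step 7: activate the logical volume
lvchange -ay /dev/vg_data/lv_data
```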

3 Replication configurations on server nodes

There are several steps that must be performed to correctly configure the replication on both sites.

Make sure that on both sites, on both nodes (four servers in total, two on each site), /etc/drbd.conf looks like the following (an example ships in /usr/share/doc/drbd…/drbd.conf.example):
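The listing itself did not survive in the post; on RHEL 6 the stock /etc/drbd.conf consists of just two include directives, which is very likely what was shown here:

```
# /etc/drbd.conf
include "drbd.d/global_common.conf";
include "drbd.d/*.res";
```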

Make sure that on both sites, on both nodes (four servers in total, two on each site), /etc/drbd.d/global_common.conf looks like the following:
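The original listing is missing. A minimal sketch consistent with the text (asynchronous replication, LVM snapshots taken on the resync target) might look like the following; the usage-count setting and the 10M resync rate are assumptions to be tuned to the inter-site link:

```
global {
    usage-count no;
}

common {
    protocol A;        # asynchronous replication between the sites

    syncer {
        rate 10M;      # resync bandwidth cap (assumed value)
    }

    handlers {
        # snapshot the sync target's LV before resync, drop it afterwards
        before-resync-target "/usr/lib/drbd/snapshot-resync-target-lvm.sh";
        after-resync-target  "/usr/lib/drbd/unsnapshot-resync-target-lvm.sh";
    }
}
```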

Make sure that on both sites, on both nodes (four servers in total, two on each site), /etc/drbd.d/repdata.res looks like the following:
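The original listing is missing. Based on the description that follows (device /dev/drbd0 over /dev/vg_data/lv_data, two floating IP end points that are the database cluster IPs), a sketch for the DRBD 8.3 series could be as below; port 7788 and internal meta-data are assumptions, and which IP belongs to which site must match your cluster IP assignment:

```
resource repdata {
    protocol A;

    device    /dev/drbd0;
    disk      /dev/vg_data/lv_data;
    meta-disk internal;

    # floating peers: the end points are the database cluster virtual IPs,
    # so replication follows the cluster service rather than a fixed host
    floating 172.20.101.19:7788;
    floating 172.20.101.3:7788;
}
```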

Change /usr/lib/drbd/snapshot-resync-target-lvm.sh on both nodes of each site to include LVM tagging for the snapshot volume created during resync. The changed line is:
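The changed line itself is missing from the post. The idea is to add an --addtag option to the script's lvcreate call so the resync snapshot carries the HA-LVM host tag; the exact variable names depend on the script version shipped with your drbd package, so the following is only an approximation:

```sh
lvcreate -s -n "$SNAP_NAME" -L "${SNAP_SIZE}k" --addtag "$(hostname)" "$VG_NAME/$LV_NAME"
```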

The repdata.res file defines the replication resource repdata. Note that on both sites the device /dev/drbd0 is created on top of the logical volume /dev/vg_data/lv_data. Two floating IP resources are attached to this replication resource and represent the end points of the replication; the IPs used by the repdata resource are the database cluster IPs of the PR and DR sites.

4 Replication configurations on clusters

There are certain configuration changes that must be made to the cluster configurations in order to accommodate the replication of data between sites.

4.1 Replication Global Resources on production cluster

Add a resource of type “Script” with the properties:
Name DRBDSlave
Full path to script file: /etc/init.d/drbd

Add a resource of type “Script” with the properties:
Name DRBDMaster
Full path to script file: /etc/init.d/master

The drbd and master scripts above are in fact the same script, taken from the drbd package:
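Consistent with the statement above, /etc/init.d/master would simply be a copy of the stock drbd init script under a second name, so that the cluster can reference it as a distinct Script resource (the copy approach is an assumption; a symlink would serve equally well):

```sh
cp /etc/init.d/drbd /etc/init.d/master
chmod 755 /etc/init.d/master
```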

4.2 Replication Service Resources on cluster

Go to “Service Groups” and add a new service with the following properties:
Service Name: DRBD_Slave_Service
Automatically Start This Service: off
Failover Domain: DATABASE
Recovery Policy: Relocate
Add in place the following resources:

Replication cluster service virtual IP
To add the resource to the cluster, go to the “Add Resource” menu and add it as an “IP Address” resource type with the properties:
IP Address: 172.20.101.3/28
Netmask Bits: 28
Monitor Link: on
Number of seconds to sleep after removing an IP address: 10

As a child resource add the following resource:

Replication data logical volume
A DBDataLVM_Slave resource of type “HA LVM” will be added, having the following properties:
Volume Group Name: vg_data
Logical Volume Name: lv_data

As a child resource add the global resource DRBDSlave.


4.3 Replication related changes to the Database Cluster Service on production cluster

To accommodate site replication the Database_Service cluster service (see Service Groups) must be changed to include the DRBD scripts.

Delete the Database_Service cluster service and recreate it as follows:

Go to “Service Groups” and add a new service with the following properties:
Service Name: Database_Service
Automatically Start This Service: on
Failover Domain: DATABASE
Recovery Policy: Relocate

Make sure to Submit changes.

Add the following resources to the service, in the order given, using the “Add Resource” button. The order in which the resources are added is very important: it is the order in which they are initialized at service start, and when the service is stopped the resources are stopped in reverse order. Because there are dependencies between resources, we first have to add the resources which are independent and then the ones which depend on the availability of other resources.

1. Add “172.20.101.19” resource
2. Add as a child resource of the previous resource “DBLogLVM” resource
3. Add as a child resource of the previous resource “DBLog” resource
4. Add as a child resource of the previous resource “DBDataLVM” resource
5. Add as a child resource of the previous resource “DRBDMaster” resource
6. Add as a child resource of the previous resource “DBData” resource
7. Add as a child resource of the previous resource “DB11g” resource

4.4 Replication Global Resources on remote cluster

Add a resource of type “Script” with the properties:
Name DRBDSlave
Full path to script file: /etc/init.d/drbd

Add a resource of type “Script” with the properties:
Name DRBDMaster
Full path to script file: /etc/init.d/master

4.5 Replication Service Resources on remote cluster

Go to “Service Groups” and add a new service with the following properties:
Service Name: DRBD_Slave_Service
Automatically Start This Service: off
Failover Domain: DATABASE
Recovery Policy: Relocate
Add in place the following resources:

Replication cluster service virtual IP
To add the resource to the cluster, go to the “Add Resource” menu and add it as an “IP Address” resource type with the properties:
IP Address: 172.20.101.19/28
Netmask Bits: 28
Monitor Link: on
Number of seconds to sleep after removing an IP address: 10
As a child resource add the following resource:
Replication data logical volume
A DBDataLVM_Slave resource of type “HA LVM” will be added, having the following properties:
Volume Group Name: vg_data
Logical Volume Name: lv_data

As a child resource add the global resource DRBDSlave.

4.6 Replication related changes to the Database Cluster Service on remote cluster

To accommodate site replication the Database_Service cluster service (see Service Groups) must be changed to include the DRBD scripts.

Delete the Database_Service cluster service and recreate it as follows:

Go to “Service Groups” and add a new service with the following properties:
Service Name: Database_Service
Automatically Start This Service: on
Failover Domain: DATABASE
Recovery Policy: Relocate

Make sure to Submit changes.

Add the following resources to the service, in the order given, using the “Add Resource” button. The order in which the resources are added is very important: it is the order in which they are initialized at service start, and when the service is stopped the resources are stopped in reverse order. Because there are dependencies between resources, we first have to add the resources which are independent and then the ones which depend on the availability of other resources.

1. Add “172.20.101.3” resource
2. Add as a child resource of the previous resource “DBLogLVM” resource
3. Add as a child resource of the previous resource “DBLog” resource
4. Add as a child resource of the previous resource “DBDataLVM” resource
5. Add as a child resource of the previous resource “DRBDMaster” resource
6. Add as a child resource of the previous resource “DBData” resource
7. Add as a child resource of the previous resource “DB11g” resource

5 Replication General Set-up

5.1 Synchronizing the sites for the first time

The first synchronization must be done differently from normal replication under the final configuration.

Add by hand the production database cluster IP as a resource on the first cluster node of the production site, and the remote database cluster IP as a resource on the first cluster node of the remote site.
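Assuming the Database_Service IP assignments given later in this document (172.20.101.19 on production, 172.20.101.3 on remote) and an eth0 interface (the interface name is an assumption), this amounts to:

```sh
# on the first node of the production cluster
ip addr add 172.20.101.19/28 dev eth0

# on the first node of the remote cluster
ip addr add 172.20.101.3/28 dev eth0
```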

Initialize the meta-data area on disk before starting DRBD, on both the production and remote sites. Note that, because the replication resource repdata is a cluster resource, this operation must be done on only one node of each cluster.
On both sites execute the following command:
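The command itself is missing from the post; for the repdata resource it would be:

```sh
drbdadm create-md repdata
```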

Start DRBD on both first cluster nodes (those to which the database cluster IPs were bound) on production and remote.
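With the init script shipped on RHEL 6:

```sh
service drbd start
service drbd status
```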

As you can see, both nodes are Secondary, which is normal. We need to decide which node will act as the primary node; to do this we have to initiate the first ‘full sync’ between the two nodes.

On node one of the production cluster execute the following:
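The command is missing from the post; with DRBD 8.3 (the series current for RHEL 6 at the time) the full sync is forced from the node that is to become primary with:

```sh
drbdadm -- --overwrite-data-of-peer primary repdata
```

On DRBD 8.4 the equivalent is drbdadm primary --force repdata.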

Wait for the first site synchronization to finish by monitoring the situation on the above node with the command:
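The command is missing from the post; the kernel status in /proc/drbd shows the resync progress, for example:

```sh
watch -n 10 cat /proc/drbd
```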

Note that this may take some time, as the first time the whole 100 GB of the lv_data logical volume is replicated from the PR site to the DR site.

After the synchronization between the sites is done, we can now format the new DRBD device /dev/drbd0.
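The command is missing from the post; assuming an ext4 file system (the file system type is not stated):

```sh
mkfs.ext4 /dev/drbd0
```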

Mount the DRBD device under the /data folder and then copy the backup of the old /data folder into it.
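For example, with the backup location left as a placeholder:

```sh
mount /dev/drbd0 /data
cp -a /path/to/backup/. /data/    # /path/to/backup is a placeholder
```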

5.2 Replication from live site to the standby site

After the initial site to site synchronization was done we are ready to start the normal replication operations.

On the standby site, starting the DRBD_Slave_Service implicitly sets up the DRBD of the standby site as the slave, i.e. the side that receives the replicated data.

On the live site, starting the Database_Service implicitly sets up the DRBD of the live site as the master, i.e. the side that generates the data.

It is important to note that the slave site must always be started first, and then the master site.

6 Switching from one site to another

When doing a site switch from the live site to the standby site the following steps must be performed:
1. Stop the Application_Service on the live site
2. Check that the sites are in sync. If not wait for the standby site to receive all the replication data.
This can be checked by executing service drbd status on the standby site
3. Stop the Database_Service on the live site
4. Stop the DRBD_Slave_Service on the standby site
5. Start the DRBD_Slave_Service on the live site
6. Start the Database_Service on the standby site
7. Start the Application_Service on the standby site
8. Check that the sites are in sync.
This can be checked by executing service drbd status on the standby site and live site.

The site switch is now complete.

7 Replication Monitoring

At any time the replication status can be easily monitored by running the following command on both sites:
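The command is missing from the post; as elsewhere in this procedure it is the init script's status action, optionally alongside the raw kernel status:

```sh
service drbd status
cat /proc/drbd
```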

8 Promoting the DR site as the main site without stopping the PR site
In some situations we have to promote the DR site to be the main site without performing a graceful site switch as described in paragraph 6.
This must be performed only in the following cases:
1. The live site experienced a total failure.
2. The live site is completely isolated from the other participants.

The following steps must be performed:

1. Ensure that the conditions from the above paragraph are met.
2. Start the Database_Service on the standby site
3. Start the Application_Service on the standby site

9 DRBD split brain recovery

After the live site is recovered following the promotion of the DR site to main site as instructed in paragraph 8, the replication mechanism will experience the so-called “split brain” condition.

DRBD detects split brain at the time connectivity between production and remote sites becomes available again and the nodes exchange the initial DRBD protocol handshake. If DRBD detects that both nodes are (or were at some point, while disconnected) in the primary role, it immediately tears down the replication connection. The tell-tale sign of this is a message like the following appearing in the system log:
Split-Brain detected, dropping connection!

After split brain has been detected, one node will always have the resource in a StandAlone connection state. The other might either also be in the StandAlone state (if both nodes detected the split brain simultaneously), or in WFConnection (if the peer tore down the connection before the other node had a chance to detect split brain).
We select the PR site as the node whose modifications will be discarded (this node is referred to as the split brain victim).
The split brain victim needs to be in the StandAlone connection state, or the following commands will return an error. You can ensure it is StandAlone by issuing, on the PR site:
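The command is missing from the post; disconnecting the resource forces it into the StandAlone state:

```sh
drbdadm disconnect repdata
```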

Also on the PR site, execute the following commands to force the site to become the secondary replication site (the site receiving the replicated data):
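The commands are missing from the post; in DRBD 8.3 syntax (on 8.4 the second step is drbdadm connect --discard-my-data repdata):

```sh
drbdadm secondary repdata
drbdadm -- --discard-my-data connect repdata
```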

On the remote site (the split brain survivor), if its connection state is also StandAlone, you would enter:
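The command is missing from the post; on the survivor it is a plain reconnect:

```sh
drbdadm connect repdata
```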

You may omit this step if the node is already in the WFConnection state; it will then reconnect automatically.
Upon connection, your split brain victim (PR site) immediately changes its connection state to SyncTarget, and has its modifications overwritten by the remaining primary node. After re-synchronization has completed, the split brain is considered resolved and the two nodes form a fully consistent, redundant replicated storage system again.
As in paragraph 7, the replication status can be monitored at any time by running service drbd status on both sites.

