Validate and investigate a live DRBD-replicated Oracle instance using an LVM snapshot

October 19, 2017

This is another post related to the DRBD replication setup I describe in the post DRBD based disk replication of a production cluster to a remote site cluster on RHEL 6.

Sometimes we need to mount the offline replicated Oracle database instance for investigation. This can be done using logical volume snapshots.

Execute the following procedure on the site that acts as the disaster recovery site.

On the cluster node where the DRBD_Slave_Service is running, freeze the cluster service by executing:

# clusvcadm -Z DRBD_Slave_Service

Freezing ensures that all applications associated with DRBD_Slave_Service keep running, but the cluster no longer monitors the service.
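
Optionally, verify the state before proceeding; clustat should still report DRBD_Slave_Service as started on this node, since freezing only stops rgmanager from managing the service:

# clustat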

Make a backup of /etc/lvm/lvm.conf:

# cp /etc/lvm/lvm.conf /etc/lvm/lvm.conf.bkp

Edit lvm.conf and add the “vg_data” volume group under the volume_list directive:

# vi /etc/lvm/lvm.conf
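
For reference, the resulting entry might look like the line below. “VolGroup00” stands in for whatever your root volume group is called, so keep the entries already present in your volume_list and only append vg_data:

volume_list = [ "VolGroup00", "vg_data" ]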

Scan the volumes

# vgscan

Create a snapshot of the lv_data volume where the database files are stored.

# lvcreate -s -L 20G -n snapdb /dev/vg_data/lv_data

The snapdb snapshot, together with lv_data, gives us a view of the logical volume lv_data exactly as it was at the moment the above command was run. The snapdb logical volume stores only the blocks of lv_data that change after that point.

Note that we are able to create the snapshot because there is free space in the vg_data volume group, left over from the space reserved for the hidden DRBD replication logical volume. Make sure not to exceed that free space when creating the snapshot (that is why we use 20 GB here).
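
You can check the free space in the volume group with vgs, and keep an eye on how full the snapshot itself gets with lvs (the Data% column); if snapdb ever fills up completely it becomes invalid:

# vgs vg_data
# lvs vg_data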

At this point we can mount the snapdb snapshot as the database directory:

# mount /dev/vg_data/snapdb /data
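
As a quick sanity check, confirm that the snapshot is mounted and the datafiles are visible (the /data/oradata/prod path matches the control file location used further down; adjust it to your layout):

# df -h /data
# ls /data/oradata/prod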

Then we start the Oracle listener:

# su - oracle
# lsnrctl start
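
If in doubt, confirm the listener is up before starting the database:

# lsnrctl status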

Still as the oracle user, copy the control file to the flash recovery area; we need to do this because the flash_recovery_area is not on the replicated drive.

# cp /data/oradata/prod/control01.ctl /home/oracle/app/oracle/flash_recovery_area/prod/control02.ctl

Then start the Oracle database:

# sqlplus / as sysdba
SQL> startup
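
Once startup completes, a quick check from the same SQL*Plus session confirms the instance is open; these are standard dynamic performance views, nothing specific to this setup:

SQL> select instance_name, status from v$instance;
SQL> select name, open_mode from v$database;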

When the startup procedure is finished, the Oracle instance is open for investigation with SQL Developer, or we can even start a client application against it.

After the investigation is done we must clean up. Note that any changes made in the database do not affect our “real” database on lv_data, only the snapdb copy.

Stop Oracle by executing the following (disconnect any investigation sessions first, since a plain shutdown waits for connected users to log off):

# su - oracle
# sqlplus / as sysdba
SQL> shutdown

Then, still as the oracle user, stop the Oracle listener:

# lsnrctl stop

Then, as the root user, unmount the data directory:

# umount /data

Remove the temporary snapdb logical volume:

# lvremove /dev/vg_data/snapdb
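
Optionally confirm that the snapshot volume is gone before restoring lvm.conf:

# lvs vg_data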

Copy back the lvm.conf backup:

# cp /etc/lvm/lvm.conf.bkp /etc/lvm/lvm.conf

Back up the initrd image of the current kernel in case the rebuilt version has an unexpected problem.

# cp /boot/initramfs-$(uname -r).img /boot/initramfs-$(uname -r).img.$(date +%m-%d-%H%M%S).bak

Now rebuild the initramfs for the current kernel version, so that the copy of lvm.conf embedded in the initramfs matches the restored configuration:

# dracut -f -v
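
To double-check that the rebuilt image carries the restored lvm.conf, you can list its contents with lsinitrd (part of dracut; the output format varies between versions):

# lsinitrd /boot/initramfs-$(uname -r).img | grep lvm.conf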

Unfreeze the DRBD_Slave_Service cluster service:

# clusvcadm -U DRBD_Slave_Service

Note that during all of the above operations the replication between the live database and the disaster recovery database keeps running without any interruption.
