Using AIX VG mirroring in combination with hardware snapshots
One of the great things about Logical Volume Managers is how you can use them for all manner of clever solutions. I recently explored how to use a combination of hardware snapshots and LVM to create rapid backups without using backup software (or as a source for a data protection product).
To do this we need to do the following:
- We present a staging disk to the host, large enough to hold the data we are trying to protect; in this example, a volume group (VG) holding DB2 data. This disk could come from a different primary storage device (e.g. an XIV or a Storwize V7000) or could be an Actifio-presented disk. You need to check whether your multi-pathing software will work with that disk.
- We mirror our datavg onto our new staging disk using AIX VG mirroring.
- We take a hardware snapshot of that disk.
- We now allow the VG mirror to become stale, removing disk load from the host.
- Prior to taking the next snapshot, we get the mirrors back in sync again.
Which variation of this process you use depends on whether you prefer to keep the two copies continuously in sync or let them go stale. The advantage of letting them go stale is that you avoid the ongoing disk I/O needed to keep them in sync. While you will need to catch up later, the total effort of that resync may well be significantly less than the continual effort of mirroring. The whole cycle is sketched below.
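As a rough sketch, the cycle looks like this at the command line (the snapshot step is entirely platform-specific, so it appears here only as a placeholder comment):

extendvg db2vg hdisk0     # one-time setup: add the staging disk to the VG
mirrorvg db2vg hdisk0     # one-time setup: mirror the VG onto the staging disk
# ... take a hardware snapshot of hdisk0 (platform-specific) ...
splitvg -c2 db2vg         # let copy 2 go stale to remove the mirroring I/O load
# ... time passes; before the next snapshot ...
joinvg db2vg              # rejoin the copies and wait for the resync to finish
# ... take the next hardware snapshot, then split again ...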
Example configuration
We have a VG (called db2vg) with one copy. We know only one copy exists because each logical volume (LV) in the volume group has PPs equal to its LPs and sits on a single PV.
[AIX_LPAR_5:root] / > lsvg -l db2vg
db2vg:
LV NAME           TYPE      LPs  PPs  PVs  LV STATE    MOUNT POINT
jfsdb2log1        jfs2log   1    1    1    open/syncd  N/A
jfsdb2log2        jfs2log   1    1    1    open/syncd  N/A
jfsdb2log3        jfs2log   1    1    1    open/syncd  N/A
db2binlv          jfs2      14   14   1    open/syncd  /db2
db2loglv          jfs2      10   10   1    open/syncd  /db2log
db2datalv         jfs2      40   40   1    open/syncd  /db2data
If I display the detailed view of the relevant VG, I can see the VG is currently in a good state:
[AIX_LPAR_5:root] / > lsvg -L db2vg
VOLUME GROUP:       db2vg                    VG IDENTIFIER:  00f771ac00004c0000000144bf115a1e
VG STATE:           active                   PP SIZE:        512 megabyte(s)
VG PERMISSION:      read/write               TOTAL PPs:      71 (36352 megabytes)
MAX LVs:            512                      FREE PPs:       4 (2048 megabytes)
LVs:                6                        USED PPs:       67 (34304 megabytes)
OPEN LVs:           6                        QUORUM:         2 (Enabled)
TOTAL PVs:          1                        VG DESCRIPTORS: 2
STALE PVs:          0                        STALE PPs:      0
ACTIVE PVs:         1                        AUTO ON:        yes
MAX PPs per VG:     130048
MAX PPs per PV:     1016                     MAX PVs:        128
LTG size (Dynamic): 512 kilobyte(s)          AUTO SYNC:      no
HOT SPARE:          no                       BB POLICY:      relocatable
PV RESTRICTION:     none                     INFINITE RETRY: no
We have added one new disk to the server. We know it is not in use because it belongs to no VG (lspv shows None).
[AIX_LPAR_5:root] / > lspv
hdisk0          00f771acd7988621                    None
hdisk5          00f771acbf1159f6                    db2vg           active
hdisk6          00f771ac41353d73                    rootvg          active
[AIX_LPAR_5:root] / > lsdev -Cc disk
hdisk0 Available C9-T1-01 MPIO IBM 2076 FC Disk
hdisk5 Available C9-T1-01 MPIO IBM 2076 FC Disk
hdisk6 Available C9-T1-01 MPIO IBM 2076 FC Disk
We extend the VG onto the new staging disk and then mirror it, specifying the VG name (db2vg) and the name of the unused disk (hdisk0). Because mirroring takes a while, we run the mirrorvg command as a background task with &:
[AIX_LPAR_5:root] / > extendvg db2vg hdisk0
[AIX_LPAR_5:root] / > mirrorvg db2vg hdisk0 &
0516-1804 chvg: The quorum change takes effect immediately.
We monitor the mirroring with a script. I did not write this script, but I did modify it; the original author (W.M. Duszyk) should thus be acknowledged. Thanks also to Chris Gibson for his help with this.
#!/usr/bin/ksh93
### W.M. Duszyk, 3/2/12
### AVandewerdt 01/05/14
### show percentage of re-mirrored PPs in a volume group
[[ $# < 1 ]] && { print "Usage: $0 vg_name"; exit 1; }
vg=$1
printf "Volume Group $vg has "
lsvg -L $vg | grep 'ACTIVE PVs:' | awk '{printf $3}'
printf " copies "
Stale=`lsvg -L $vg | grep 'STALE PPs:' | awk '{print $6}'`
[[ $Stale = 0 ]] && { print "and is fully mirrored."; exit 2; }
Total=`lsvg -L $vg | grep 'TOTAL PPs:' | awk '{print $6}'`
PercDone=$(( 100 - $(( $(( Stale * 50.0 )) / $Total )) ))
echo "and is mirrored $PercDone%."
exit 0
We can use this script to check if the VG is in sync. You run the script and specify the name of the VG:
[AIX_LPAR_5:root] / > ./checkvg.sh db2vg
Volume Group db2vg has 2 copies and is mirrored 85%.
We wait for it to reach 100%:
[AIX_LPAR_5:root] / > ./checkvg.sh db2vg
Volume Group db2vg has 2 copies and is fully mirrored.
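If you are scripting this, here is a minimal sketch of a wait loop built on the checkvg.sh script above (the /act/scripts path and the 30-second interval are just examples):

#!/usr/bin/ksh93
# Poll until the VG reports as fully mirrored before taking the snapshot.
vg=db2vg
until /act/scripts/checkvg.sh $vg | grep -q 'fully mirrored'
do
sleep 30    # polling interval; tune to taste
done
echo "$vg is fully mirrored - safe to snapshot the staging disk"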
If you want to see the exact state of the VG, look at the volume group details. Note how each LV now has twice as many PPs as LPs, spread across 2 PVs, and the LV state is open/syncd. An LV state of closed/syncd is not an issue if the LV is raw (rather than holding a file system) and is not currently in use by the application.
[AIX_LPAR_5:root] / > lsvg -l db2vg
db2vg:
LV NAME           TYPE      LPs  PPs  PVs  LV STATE    MOUNT POINT
jfsdb2log1        jfs2log   1    2    2    open/syncd  N/A
jfsdb2log2        jfs2log   1    2    2    open/syncd  N/A
jfsdb2log3        jfs2log   1    2    2    open/syncd  N/A
db2binlv          jfs2      14   28   2    open/syncd  /db2
db2loglv          jfs2      10   20   2    open/syncd  /db2log
db2datalv         jfs2      40   80   2    open/syncd  /db2data
Now display that LV. We can see hdisk0 is copy 2 (PV 2). This is good.
[AIX_LPAR_5:root] / > lslv -m db2binlv
db2binlv:/db2
LP    PP1  PV1     PP2  PV2     PP3  PV3
0001  0002 hdisk5  0002 hdisk0
0002  0003 hdisk5  0003 hdisk0
0003  0004 hdisk5  0004 hdisk0
0004  0005 hdisk5  0005 hdisk0
0005  0006 hdisk5  0006 hdisk0
0006  0007 hdisk5  0007 hdisk0
0007  0008 hdisk5  0008 hdisk0
0008  0009 hdisk5  0009 hdisk0
0009  0010 hdisk5  0010 hdisk0
0010  0011 hdisk5  0011 hdisk0
0011  0012 hdisk5  0012 hdisk0
0012  0013 hdisk5  0013 hdisk0
0013  0014 hdisk5  0014 hdisk0
0014  0015 hdisk5  0015 hdisk0
We are now ready to snapshot the staging disk to preserve its state while it is in sync. Once the snapshot is created, we can let the mirror go stale so that there is no disk load from keeping the staging disk in sync. You should coordinate this snapshot with the application writing to the disk; with Actifio we do this with the Actifio Connector software.
Once the snapshot is taken, we can split the VG to stop the mirroring workload. We are going to split off copy 2, which is the copy on our staging disk (hdisk0):
splitvg -c2 db2vg
The new copy is called vg00 by default; you can force AIX to use a different name.
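For instance, a sketch using splitvg's -y flag to name the split copy (db2snapvg is an arbitrary example name):

splitvg -y db2snapvg -c2 db2vg    # split copy 2 into a new VG called db2snapvg

In this example we kept the default: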
[AIX_LPAR_5:root] / > splitvg -c2 db2vg
[AIX_LPAR_5:root] / > lsvg
db2vg
rootvg
vg00
If we check db2vg, it still shows two copies (the PP counts are still doubled), but we are no longer keeping the second copy (on hdisk0) in sync.
[AIX_LPAR_5:root] / > lsvg -l db2vg
db2vg:
LV NAME           TYPE      LPs  PPs  PVs  LV STATE    MOUNT POINT
jfsdb2log1        jfs2log   1    2    2    open/syncd  N/A
jfsdb2log2        jfs2log   1    2    2    open/syncd  N/A
jfsdb2log3        jfs2log   1    2    2    open/syncd  N/A
db2binlv          jfs2      14   28   2    open/syncd  /db2
db2loglv          jfs2      10   20   2    open/syncd  /db2log
db2datalv         jfs2      40   80   2    open/syncd  /db2data
When we look at our newly created VG (vg00), it has only one copy.
[AIX_LPAR_5:root] / > lsvg -l vg00
vg00:
LV NAME           TYPE      LPs  PPs  PVs  LV STATE      MOUNT POINT
fsjfsdb2log1      jfs2log   1    1    1    closed/syncd  N/A
fsjfsdb2log2      jfs2log   1    1    1    closed/syncd  N/A
fsjfsdb2log3      jfs2log   1    1    1    closed/syncd  N/A
fsdb2binlv        jfs2      14   14   1    closed/syncd  /fs/db2
fsdb2loglv        jfs2      10   10   1    closed/syncd  /fs/db2log
fsdb2datalv       jfs2      40   40   1    closed/syncd  /fs/db2data
Curiously, while the LV states still show as synced, the mirror is already a few PPs stale:
[AIX_LPAR_5:root] / > chmod 755 checkvg.sh; ./checkvg.sh db2vg
Volume Group db2vg has 1 copies and is mirrored 99%.
I generate some change by copying files into /db2data to increase this difference. Of course, if DB2 is actually running, changes will start occurring straight away.
[AIX_LPAR_5:root] / > ./checkvg.sh db2vg
Volume Group db2vg has 1 copies and is mirrored 97%.
If we check the state of the LVs, we can see that this file I/O has created stale partitions. This is not a problem. The speed at which partitions become stale will depend on the PP size and on the address-range locality of the typical I/Os generated between snapshots.
[AIX_LPAR_5:root] / > lsvg -l db2vg
db2vg:
LV NAME           TYPE      LPs  PPs  PVs  LV STATE    MOUNT POINT
jfsdb2log1        jfs2log   1    2    2    open/stale  N/A
jfsdb2log2        jfs2log   1    2    2    open/stale  N/A
jfsdb2log3        jfs2log   1    2    2    open/stale  N/A
db2binlv          jfs2      14   28   2    open/stale  /db2
db2loglv          jfs2      10   20   2    open/stale  /db2log
db2datalv         jfs2      40   80   2    open/stale  /db2data
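You can also read the stale counters straight from lsvg; a minimal check:

lsvg -L db2vg | grep 'STALE'    # shows the STALE PVs and STALE PPs counters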
When we are ready to take the next snapshot, we need to get the two copies back in sync. To do this, we rejoin them with this command:
joinvg db2vg
We can see the two start coming back into sync:
[AIX_LPAR_5:root] / > ./checkvg.sh db2vg
Volume Group db2vg has 2 copies and is mirrored 98%.
As the copies come back into sync, the LV state returns to syncd rather than stale. In the output below, db2datalv is still stale because it is the last LV to catch up.
[AIX_LPAR_5:root] / > lsvg -l db2vg
db2vg:
LV NAME           TYPE      LPs  PPs  PVs  LV STATE    MOUNT POINT
jfsdb2log1        jfs2log   1    2    2    open/syncd  N/A
jfsdb2log2        jfs2log   1    2    2    open/syncd  N/A
jfsdb2log3        jfs2log   1    2    2    open/syncd  N/A
db2binlv          jfs2      14   28   2    open/syncd  /db2
db2loglv          jfs2      10   20   2    open/syncd  /db2log
db2datalv         jfs2      40   80   2    open/stale  /db2data
If the resync does not occur, we can force it with the syncvg command:
syncvg -v db2vg
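If only one LV is lagging, you can resync it individually rather than syncing the whole VG; a sketch using syncvg's -l option:

syncvg -l db2datalv    # resync just this logical volume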
Once we are in sync, we can do another snapshot of the staging disk.
Issues with scripting this
One thing you may want to do is allow a non-root user to perform these commands. For instance, if we want to allow the DB2 user (in this example db2inst2) to execute the splitvg and joinvg commands, we can use sudo:
- Download and install sudo on the AIX host
- Issue this command to edit the sudo config file: visudo
- Add this line:
db2inst2 ALL = NOPASSWD: /usr/sbin/joinvg,/usr/sbin/splitvg
Log on as the DB2 user and check that it worked:
[AIX_LPAR_5:db2inst2] /home/db2inst2 > sudo -l
User db2inst2 may run the following commands on this host:
    (root) NOPASSWD: /usr/sbin/joinvg
    (root) NOPASSWD: /usr/sbin/splitvg
Using the snapshot with a backup host
One strategy that can be used in combination with this method is to present the snapshot to a server running backup software. The advantage of doing this is that the backup can effectively be done off-host. The disadvantage is that each backup will be a full backup unless the backup software can scan the disk for changed files or blocks.
Import the VG
To use the snapshot, connect to the management interface of the storage device that created the snapshot and map it to your backup host. Then logon to the backup host and discover the disks:
cfgmgr
Identify the name of the new hdisk:
lspv
lsdev -Cc disk
Then import the volume group. You need to use -f to force an import with only half the VG members present (since you are importing a snapshot of one half of a mirrored pair). In this example we have discovered hdisk1 and are using it to import the VG db2vg.
importvg -f -y db2vg hdisk1
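Once imported, a sketch of the rest of the off-host cycle (the mount points keep their original names because importvg does not rename anything; substitute your own backup step):

mount /db2
mount /db2log
mount /db2data
# ... run the backup tool against the mounted file systems ...
umount /db2data
umount /db2log
umount /db2
varyoffvg db2vg    # take the VG offline
exportvg db2vg     # remove the VG definition before unmapping the snapshot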
Recreate the VG
If you are presenting the snapshot back to the same host that has the original VG, there are two extra steps. Because the snapshot has the same PVID as the staging disk, you need to clear the PVID and use the recreatevg command rather than importvg.
In this example I have two VGs and two disks.
[aix_lpar_4:root] / > lspv
hdisk0          00f771acc8dfb10a                    actvg           active
hdisk2          00f771accdcbafa8                    rootvg          active
I map the snapshot I created and run cfgmgr. If you are sharp-eyed, you will spot that I don't have any PVID clashes here; in fact, I don't even have the original DB2 VG on this host, but the method is still completely valid.
[aix_lpar_4:root] / > cfgmgr
[aix_lpar_4:root] / > lspv
hdisk0          00f771acc8dfb10a                    actvg           active
hdisk2          00f771accdcbafa8                    rootvg          active
hdisk4          00f771acd7988621                    None
Before we can bring the VG online, we clear the PVID:
[aix_lpar_4:root] / > chdev -l hdisk4 -a pv=clear
hdisk4 changed
[aix_lpar_4:root] / > lspv
hdisk0          00f771acc8dfb10a                    actvg           active
hdisk2          00f771accdcbafa8                    rootvg          active
hdisk4          none                                None
We now build a new VG named db2restorevg on hdisk4:
[aix_lpar_4:root] / > recreatevg -f -y db2restorevg hdisk4
db2restorevg
[aix_lpar_4:root] / > lsvg -l db2restorevg
db2restorevg:
LV NAME           TYPE      LPs  PPs  PVs  LV STATE      MOUNT POINT
fsjfsdb2log1      jfs2log   1    1    1    closed/syncd  N/A
fsjfsdb2log2      jfs2log   1    1    1    closed/syncd  N/A
fsjfsdb2log3      jfs2log   1    1    1    closed/syncd  N/A
fsdb2binlv        jfs2      14   14   1    closed/syncd  /fs/db2
fsdb2loglv        jfs2      10   10   1    closed/syncd  /fs/db2log
fsdb2datalv       jfs2      40   40   1    closed/syncd  /fs/db2data
Again, if you are sharp-eyed, you will spot in the output above that every LV has fs prepended to its name: db2binlv, which was mounted on /db2, is recreated as fsdb2binlv mounted on /fs/db2. This happens because the recreatevg command assumes you are creating this VG on a host that already has the original VG, so it renames LVs and mount points to prevent name clashes. If for some reason you don't want this renaming to occur, you can suppress it as shown below, where -L / and -Y NA force the command not to rename any labels. Use this with care.
recreatevg -f -L / -Y NA -y db2restorevg hdisk4
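After recreatevg, the file systems can be mounted under their renamed mount points; a minimal sketch:

mount /fs/db2
mount /fs/db2log
mount /fs/db2data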
Backups without backup software or file system scans
If the staging disk is presented by Actifio, then Actifio will track every changed block and will only need to read the changed blocks to create a new backup image of the snapshot. The VG PP size will play a role in determining the quantity of changed blocks. This effectively allows backups without backup software since the Actifio Dedup engine can read blocks straight from snapshots created by Actifio. This is a very neat trick. Also since we presented the staging disk from the Actifio snapshot pool, we now also have a copy that we can present at will for instant test and dev or analytics purposes.
Scripting for Application Consistency
When creating the snapshot, you ideally want the whole process to be orchestrated, with a regular update job run on a schedule. The process should get the VG mirror back into sync, get the application into a consistent state (such as hot backup mode), create the snapshot, and then let the VG mirror go stale again.
The Actifio Connector can be used to coordinate application consistency. Clearly, if your staging disk comes from a different storage product, you will need to use that vendor's method. Every time Actifio starts a snapshot job (which can be automated by the Actifio SLA scheduling engine), it can call the Actifio Connector installed on the host to help orchestrate the snapshot. It does so in phases: init, freeze, thaw, fini and, if necessary, abort. We set the database name, path, and VG name at the start of the script. The init phase re-syncs the VG; the freeze phase puts DB2 into hot backup mode by suspending writes; the thaw phase takes DB2 out of hot backup mode; the fini phase splits the VG again.
#!/bin/sh
# Called by the Actifio Connector with the phase name as the first argument:
# init, freeze, thaw, fini or abort.
DBPATH=/home/db2inst2/sqllib/bin
DBNAME=demodb
VGNAME=db2vg
# freeze: put DB2 into hot backup mode by suspending writes
if [ "$1" = "freeze" ]; then
$DBPATH/db2 connect to $DBNAME
$DBPATH/db2 set write suspend for database
exit 0
fi
# thaw: take DB2 out of hot backup mode by resuming writes
if [ "$1" = "thaw" ]; then
$DBPATH/db2 connect to $DBNAME
$DBPATH/db2 set write resume for database
exit 0
fi
# init: rejoin the mirror and wait until it is fully synced
if [ "$1" = "init" ]; then
sudo joinvg $VGNAME
while true
do
synccheck=$(/act/scripts/checkvg.sh $VGNAME)
if [ "$synccheck" != "Volume Group $VGNAME has 2 copies and is fully mirrored." ]
then
echo $synccheck
sleep 30
else
break
fi
done
exit 0
fi
# fini: split the mirror again so the staging copy goes stale
if [ "$1" = "fini" ]; then
echo "Splitting $VGNAME"
sudo splitvg -c2 $VGNAME
exit 0
fi
# abort: make sure DB2 writes are resumed if the job fails
if [ "$1" = "abort" ]; then
$DBPATH/db2 connect to $DBNAME
$DBPATH/db2 set write resume for database
exit 0
fi
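You can exercise each phase by hand before letting the scheduler drive it; a sketch assuming the script is saved as /act/scripts/actscript.sh (the name is arbitrary):

/act/scripts/actscript.sh init      # rejoin the mirror and wait for full sync
/act/scripts/actscript.sh freeze    # suspend DB2 writes
# ... the hardware snapshot is taken at this point ...
/act/scripts/actscript.sh thaw      # resume DB2 writes
/act/scripts/actscript.sh fini      # split the VG again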
Hopefully this whole process is helpful, whether you use Actifio or not.