Online Backups and Recovery in a Snap AIX
The ability to back up a JFS2 file system online means minimal downtime, and that’s a good thing in any system admin’s book. AIX snapshots on JFS2 file systems let you do just that. A snapshot is a point-in-time image of the file system, and three kinds are available: internal, external and backsnap. Because data blocks are copied using the copy-on-write method, that is, only when they are about to change, the resulting snapshot doesn’t require much space. Here, we’ll focus on external snapshots and backsnaps.
Why Snapshot
Why use snapshot instead of the tar or copy utilities? With snapshot, the file system is frozen, ensuring you get a full copy, and avoid “open file,” “running process” or “file not found” issues. Also, there’s generally no need to shut down an application, though I typically go for a quiesce on an application prior to doing a snapshot; then un-quiesce afterwards.
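The quiesce, snapshot, un-quiesce sequence mentioned above can be sketched as a small wrapper. This is only an illustration under stated assumptions: app_quiesce and app_resume are hypothetical placeholders for whatever your application provides, and SNAPSHOT_CMD is an assumed override so the sequence can be rehearsed without touching a real file system.

```shell
# Hypothetical hooks: replace the bodies with your application's own
# quiesce and resume commands. They are NOT part of AIX.
app_quiesce() { echo "quiesce application"; }
app_resume()  { echo "resume application"; }

# Take an external snapshot of file system $1 with size $2 while the
# application is quiesced. The application is resumed even if the
# snapshot command fails, and the snapshot's exit status is returned.
snap_quiesced() {
    fs=$1
    size=$2
    app_quiesce || return 1
    # SNAPSHOT_CMD is an assumed override for dry runs (e.g. "echo").
    ${SNAPSHOT_CMD:-snapshot} -o snapfrom="$fs" -o size="$size"
    rc=$?
    app_resume
    return "$rc"
}
```

A call might look like snap_quiesced /opt/portal 250M. Resuming the application regardless of the snapshot result is a deliberate choice here, so a failed snapshot never leaves the application paused.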
You can have up to 15 external snapshots of a JFS2 file system at any one time. When upgrading applications in a test environment, it’s quite common to take a snapshot of the file system after each refresh of that environment.
Doing a JFS2 online backup is all well and good, but one also needs the ability to restore in case things go wrong. Using the rollback utility, a file system can be rolled back to the point when an external snapshot was taken by specifying the device the snapshot resides on. It doesn’t get much better than that. To restore from a backsnap, use the restore command.
Personal Preferences
Which method is used to create the snapshots will depend on what you’re most comfortable with. So let’s go over the different types:
External snapshots
- Are created on any existing JFS2 file system
- Keep their data on a separate logical volume
- Can be mounted as a separate file system
- Have a read-only data area
Backsnap
- Is primarily an interface for the snapshot command
- Does all the snapshot work for you
- Holds the resulting backup of the file system in an archived file or on tape
If a snapshot runs out of space, all snapshots for that file system will become invalid—in other words, unusable. In my own work, I prefer external snapshots, mainly because they can be mounted, if required, giving them a more visual presence.
External Snapshot
In this demonstration, the file system is called /opt/portal. It is 512M in size and holds just over 280M of data files.
# lsfs |grep -w portal
/dev/fslv02     --     /opt/portal     jfs2     1048576     rw     yes     no
# df -m |grep portal
/dev/fslv02     512.00     231.60     55%     7     1%     /opt/portal
# pwd
/opt/portal
# ls
app_be     app_fr     app_nl     lost+found
As a rule, I create an external snapshot at about half the size of the original file system. The IBM documentation recommends 10 to 15 percent of the source file system, but I like to give the snapshot plenty of room and err on the side of safety. In this scenario, roughly 50 percent translates into 250M. The common format to create a snapshot is:
snapshot -o snapfrom=<file system> -o size=<size>
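The sizing rule of thumb can be written down as a small calculation. The helper below is a sketch only; snap_size_mb is a hypothetical name, and it simply takes a percentage of the source file system size and rounds up to a whole megabyte.

```shell
# Suggest a snapshot size in MB: a percentage of the source file
# system size, rounded up to a whole megabyte.
# snap_size_mb is an illustrative helper, not an AIX command.
snap_size_mb() {
    fs_mb=$1      # source file system size in MB
    percent=$2    # e.g. 15 for the documented guideline, 50 for extra headroom
    echo $(( (fs_mb * percent + 99) / 100 ))
}
```

For the 512M file system used here, snap_size_mb 512 15 gives 77 and snap_size_mb 512 50 gives 256, which is close to the 250M chosen for this demonstration.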
Now to create the external snapshot:
# snapshot -o snapfrom=/opt/portal -o size=250M
Snapshot for file system /opt/portal created on /dev/fslv04
A logical volume is automatically created to hold the snapshot. To confirm it’s been created, query the snapshot of the file system in question, using the snapshot command, like so:
# snapshot -q /opt/portal
Snapshots for /opt/portal
Current  Location       512-blocks    Free      Time
*        /dev/fslv04    524288        523520    Sun Nov 4 10:57:46 GMT 2012
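The query reports sizes in 512-byte blocks, which are not easy to read at a glance. As a rough sketch, the output can be piped through a small filter that converts the size to megabytes and shows how much of the snapshot space is in use. snap_usage is a hypothetical helper name, and the filter assumes the column layout shown above (optional * marker, device, 512-blocks, free).

```shell
# Read `snapshot -q` output on stdin and print, for each snapshot:
# device, size in MB, and percentage of the snapshot space in use.
# Assumes the columns: [*] device 512-blocks free time...
snap_usage() {
    awk '
        { sub(/^[ \t]*\*[ \t]*/, "") }    # strip the "*" current marker
        $1 ~ /^\/dev\// && $2 > 0 {
            size = $2; free = $3          # both in 512-byte blocks
            printf "%s %d %.2f\n", $1, size / 2048, (size - free) * 100 / size
        }'
}
```

On a live system this would be used as snapshot -q /opt/portal | snap_usage. The 524288 blocks reported above work out to 256M, so the 250M requested appears to have been rounded up when the logical volume was allocated.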
At this point, I can mount the snapshot to view the copied data, but first the directory where it’s to be mounted needs to be created:
# mkdir /snap_portal
# mount -v jfs2 -o snapshot /dev/fslv04 /snap_portal
# df -m |grep portal
/dev/fslv02     512.00     231.60     55%     7     1%     /opt/portal
/dev/fslv04     256.00     255.62      1%     -     -      /snap_portal
Now, I can cd into that snapshot (/snap_portal) and view the copied files; note that the file system is read only with no writes allowed:
# cd /snap_portal
# ls
app_be     app_fr     app_nl     lost+found
# ls >file1
file1: The file system has read permission only.
At this point, I could take a further backup of the snapshot file system to tape or SAN. Assuming I’ve done some data maintenance on /opt/portal and accidentally removed a file, I could simply copy the removed file back from the mounted snapshot into /opt/portal. But for now, let’s create another snapshot of /opt/portal:
# snapshot -o snapfrom=/opt/portal -o size=250M
Snapshot for file system /opt/portal created on /dev/fslv05
Like before, query the snapshots of /opt/portal:
# snapshot -q /opt/portal
Snapshots for /opt/portal
Current  Location       512-blocks    Free      Time
         /dev/fslv04    524288        523520    Sun Nov 4 10:57:46 GMT 2012
*        /dev/fslv05    524288        523520    Sun Nov 4 11:00:53 GMT 2012
In the above output the * denotes the most recent snapshot. If I no longer needed a snapshot I could delete it using the snapshot command. The common format to remove a snapshot is:
snapshot -d <snapshot device>
For example, to remove the snapshot on /dev/fslv04, I could use:
# snapshot -d /dev/fslv04
Suppose I now needed to restore the whole file system because an update on /opt/portal had gone wrong. I can choose which snapshot to roll back to. For me, this is one of the main selling points of snapshots: if you’ve taken quite a few, you have multiple points from which to restore. For our demonstration, I’ll roll back to the snapshot taken on Sunday, Nov. 4, at 10:57, which is /dev/fslv04. I would first unmount all the snapshots (if they were mounted), then unmount /opt/portal, before issuing the rollback command. The common format for the rollback command is:
rollback -v <file system> <snapshot device>
When a rollback occurs, all snapshots of that file system will be removed. To roll back and restore the original file system /opt/portal, I could use:
# umount /opt/portal
# rollback -v /opt/portal /dev/fslv04
Restoring block 1
Restoring block 1000
Restoring block 2000
...
Restoring block 12000
Total blocks restored = 12809
rmlv: Logical volume fslv04 is removed.
rmlv: Logical volume fslv05 is removed.
Rollback complete
To confirm that no snapshots are left, list the snapshots:
# snapshot -q /opt/portal
/opt/portal has no snapshots.
Now, we can remount /opt/portal and have the original contents of the file system restored!
# mount /opt/portal
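Because the order of operations matters during a rollback, it can help to write the sequence out before running anything. The sketch below only prints the commands described above and executes nothing; rollback_plan is a hypothetical helper name, and the mount points passed to it are whatever snapshot mounts exist on your system.

```shell
# Print the commands for rolling back file system $1 to the snapshot
# on device $2. Any further arguments are mount points of mounted
# snapshots, which are unmounted first. Nothing is executed.
rollback_plan() {
    fs=$1
    snapdev=$2
    shift 2
    for mnt in "$@"; do
        echo "umount $mnt"
    done
    echo "umount $fs"
    echo "rollback -v $fs $snapdev"
    echo "mount $fs"
}
```

For this demonstration, rollback_plan /opt/portal /dev/fslv04 /snap_portal lists the four steps in order, which can be reviewed and then run by hand. Remember that the rollback removes every snapshot of the file system, not just the one being restored.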
Backsnap
The backsnap command does most of the work for you in the background. It creates the logical volume to hold the snapshot, then copies the contents into an archived file or onto a tape device. The backed-up files can then be restored using the restore command. Using the /opt/portal file system to create the backsnap, let’s see how it hangs together. A common format for the backsnap command is:
backsnap -m <mount point> -s size=<size> -f <archive file or device> <file system>
In this example, I will use the following values: /backsnap_portal is the temp mount point; 250M is the size; /opt/dump/backup_portal is the archive file; and /opt/portal will be the source file system.
# backsnap -m /backsnap_portal -s size=250M -f /opt/dump/backup_portal /opt/portal
Snapshot for file system /opt/portal created on /dev/fslv05
backup: Backing up /dev/rfslv05 to /opt/dump/backup_portal.
backup: There are an estimated 286768 1k blocks.
backup: There are 287321 1k blocks on 1 volumes.
backup: The backup is complete.
We now have an archive file called backup_portal in the directory /opt/dump. The archive is about 280M in size:
# du -ms /opt/dump/backup_portal
280.56 backup_portal
To confirm the files are present in the archive, I could list the files, using the restore command, like so:
# restore -tvf /opt/dump/backup_portal
To restore the file app_be from the archived file back to /opt/portal, I could use:
# cd /opt/portal
# restore -xvf /opt/dump/backup_portal app_be
Extracting directories from media.
Initializing the symbol table.
Extracting requested files..
Specify the next volume number: 1
Extracting file ./app_be.
To take a snap and place the files on tape rmt0, using the attributes as in the previous backsnap, I could use:
# backsnap -m /backsnap_portal -s size=250M -f /dev/rmt0 /opt/portal
If you’ve taken a snapshot and the contents have gone to tape, use the restore command to list the tape, like so:
# restore -tvf /dev/rmt0
The snapshot taken can also be listed using the command:
# snapshot -q /opt/portal
Snapshots for /opt/portal
Current  Location       512-blocks    Free      Time
*        /dev/fslv08    524288        522496    Sun Nov 4 11:30:25 GMT 2012
If you decide to remove the snapshot created by backsnap, the archived file or the files that went to tape will remain intact. That isn’t the case with an external snapshot: once it is removed, its point-in-time copy is gone, unless you’ve previously taken a further backup of that snapshot to some other media.
Keep an Eye on Space
As a system admin, I believe snapshots are a great way of taking online backups and of being able to restore if things go wrong. Remember, though, to keep an eye on the space used in the file systems; you don’t want to get invalid snapshots because you ran out of space.
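One way to keep that eye on space is a periodic check of the snapshot query output. The following is a minimal sketch, assuming the query prints an optional * marker, the device, the size in 512-byte blocks and the free blocks, in that order. snap_low_space and the threshold value are illustrative choices, not AIX features.

```shell
# Read `snapshot -q` output on stdin and print every snapshot device
# whose free space is below $1 percent of its size.
# Exit status is 1 if at least one device was printed, 0 otherwise.
snap_low_space() {
    awk -v min="$1" '
        { sub(/^[ \t]*\*[ \t]*/, "") }    # strip the "*" current marker
        $1 ~ /^\/dev\// && $2 > 0 && $3 * 100 / $2 < min {
            print $1
            low = 1
        }
        END { exit low + 0 }'
}
```

Run from cron as, for example, snapshot -q /opt/portal | snap_low_space 20, the non-zero exit status could be used to trigger a mail or an alert before a snapshot fills up and invalidates the others.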