Linux LVM recovery

A few days ago I made a mistake and forced fsck to check the partition containing the LVM physical volume instead of the logical volume. As a result I ended up with broken LVM metadata and could no longer see the volume group or the logical volumes.
The pvs output looked like this:

# pvs -v

Scanning for physical volume names
Incorrect metadata area header checksum

I tried running pvck, but it did not help: it found the corrupted metadata but did not repair it:

# pvck -d -v /dev/md5
Scanning /dev/md5
Incorrect metadata area header checksum
Found label on /dev/md5, sector 1, type=LVM2 
Found text metadata area: offset=4096, size=193024
Incorrect metadata area header checksum

Eventually I found out that it is possible to back up LVM metadata and restore it when needed, but I assumed all I had was a broken LVM with broken metadata.
It is hard to describe how happy I was when I discovered that, by default, LVM backs up its metadata whenever you make a change. I found the backup in the /etc/lvm/backup directory, and after that recovery became an easy task. First I recreated the physical volume:

pvcreate -u b3Lk2a-pydG-Vhf3-DSEJ-9b84-RLm9-UEr6r3 --restorefile /etc/lvm/backup/vg-320 /dev/md5

The UUID can be found in the pv section of the metadata file:

physical_volumes {

    pv0 {
        id = "b3Lk2a-pydG-Vhf3-DSEJ-9b84-RLm9-UEr6r3"
        device = "/dev/md5" # Hint only

Next I restored the volume group:

vgcfgrestore -f /etc/lvm/backup/vg-320 vg-320

After that the logical volumes became visible:

# lvs
  LV         VG     Attr     LSize   Pool Origin Data%  Move Log Copy%  Convert
  root       vg-320 -wi-a---  15.00g
  swap       vg-320 -wi-a---   1.00g
  var        vg-320 -wi-ao-- 200.00g
  zoneminder vg-320 -wi-a---  15.00g

After reinitializing with vgscan -v && vgchange -ay, the volume group was ready for fsck.
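
For reference, a minimal sketch of that last step, assuming the volume group name shown above and an ext filesystem on the root logical volume (the filesystem type is an assumption):

# vgscan -v
# vgchange -ay vg-320
# fsck -f /dev/vg-320/root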

This was my home file storage setup. It had no backups because the RAID array was meant to provide the redundancy; I did not account for what actually happened and am paying the price. The setup:

  • Ubuntu 16.04
  • Four-disk RAID 5 array using mdadm (4x2TB): /dev/md0
  • On the array, a PV and LV managed by LVM.
  • On the logical volume (lv0 in volume group vg0), an XFS file system.

Note that the Linux host, including /etc and /boot, is installed on a different disk and is completely accessible (so I have access to /etc/lvm/archive). The RAID array is purely file storage; the boot process has no dependency on it at all other than its entry in /etc/fstab.

For whatever reason I had booted from a FreeDOS installer that I was struggling to understand. I think I may have told it to repartition this volume, although I cannot remember doing so. In any case, when I rebooted into Linux (Ubuntu 16.04), I was dropped into a recovery-mode prompt as the root user. It could not mount the volume group UUID defined in /etc/fstab.

It has been long enough since I originally set up this RAID array that I completely forgot how LVM worked, or even that I had used LVM to create the volume (10-12 years, replacing hard disks and resizing the array occasionally over that time). So first I tried to use testdisk [1] to find and restore the partition information. That never worked: the partition was always the incorrect size (524 GB instead of 4.5 TB) and never on a "physical sector boundary." I experimented with various geometries, thinking there was a magic combination that would perfectly restore the partition. Here is the current status of the disk according to fdisk:

$ sudo fdisk -l /dev/md0
GPT PMBR size mismatch (1098853631 != 200894463) will be corrected by w(rite).
Disk /dev/md0: 4.1 TiB, 4500904476672 bytes, 8790829056 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 1048576 bytes / 3145728 bytes
Disklabel type: dos
Disk identifier: 0x00000000

Device     Boot Start        End    Sectors  Size Id Type
/dev/md0p1          1 1098853631 1098853631  524G ee GPT

Partition 1 does not start on physical sector boundary.

And parted:

(parted) print list                                                       
Error: /dev/md0: unrecognised disk label
Model: Linux Software RAID Array (md)                                     
Disk /dev/md0: 4501GB
Sector size (logical/physical): 512B/4096B
Partition Table: unknown
Disk Flags: 

In posting a question to the testdisk forum [2], I realized that I had used LVM to manage the RAID array, and that LVM physical volumes may not use a traditional partition table at all. Researching "recovering lvm physical volumes" turned up http://blog.adamsbros.org/2009/05/30/recover-lvm-volume-groups-and-logical-volumes-without-backups/. pvck tells me the following:

$ sudo pvck /dev/md0
  Incorrect metadata area header checksum on /dev/md0 at offset 4096
  Found label on /dev/md0, sector 1, type=LVM2 001
  Found text metadata area: offset=4096, size=192512
  Incorrect metadata area header checksum on /dev/md0 at offset 4096

I also have several backups of the LVM metadata in /etc/lvm/archive, the latest being the following:

crw@bilby:~$ sudo cat /etc/lvm/archive/vg0_00002-935168089.vg
# Generated by LVM2 version 2.02.98(2) (2012-10-15): Sun Jul 19 12:00:04 2015

contents = "Text Format Volume Group"
version = 1

description = "Created *before* executing 'lvextend /dev/vg0/lv0 /dev/md0'"

creation_host = "bilby" # Linux bilby 3.16.0-43-generic #58~14.04.1-Ubuntu SMP Mon Jun 22 10:21:20 UTC 2015 x86_64
creation_time = 1437332404  # Sun Jul 19 12:00:04 2015

vg0 {
    id = "Q4ZRRc-1l0h-FEgu-jrxA-EfW1-tAis-vv0jyL"
    seqno = 5
    format = "lvm2" # informational
    status = ["RESIZEABLE", "READ", "WRITE"]
    flags = []
    extent_size = 262144        # 128 Megabytes
    max_lv = 0
    max_pv = 0
    metadata_copies = 0

    physical_volumes {

        pv0 {
            id = "bKQs0l-zNhs-X4vw-NDfz-IMFs-cJxs-y0k6yG"
            device = "/dev/md0" # Hint only

            status = ["ALLOCATABLE"]
            flags = []
            dev_size = 8790828672   # 4.09355 Terabytes
            pe_start = 384
            pe_count = 33534    # 4.09351 Terabytes
        }
    }

    logical_volumes {

        lv0 {
            id = "pqInOe-ZLpV-t9oK-GQE1-AoIt-mB3M-4ImaV1"
            status = ["READ", "WRITE", "VISIBLE"]
            flags = []
            segment_count = 1

            segment1 {
                start_extent = 0
                extent_count = 22356    # 2.729 Terabytes

                type = "striped"
                stripe_count = 1    # linear

                stripes = [
                    "pv0", 0
                ]
            }
        }
    }
}

If it is helpful, the following is the detail on the RAID array:

$ sudo mdadm --detail /dev/md0
/dev/md0:
        Version : 0.90
  Creation Time : Sun Oct 11 13:34:16 2009
     Raid Level : raid5
     Array Size : 4395414528 (4191.79 GiB 4500.90 GB)
  Used Dev Size : 1465138176 (1397.26 GiB 1500.30 GB)
   Raid Devices : 4
  Total Devices : 4
Preferred Minor : 0
    Persistence : Superblock is persistent

    Update Time : Mon Oct  3 13:12:51 2016
          State : clean 
 Active Devices : 4
Working Devices : 4
 Failed Devices : 0
  Spare Devices : 0

         Layout : left-symmetric
     Chunk Size : 1024K

           UUID : 9be3b2f7:102e373a:822b5a8f:216da2f7 (local to host bilby)
         Events : 0.103373

    Number   Major   Minor   RaidDevice State
       0       8       64        0      active sync   /dev/sde
       1       8       48        1      active sync   /dev/sdd
       2       8       16        2      active sync   /dev/sdb
       3       8       32        3      active sync   /dev/sdc

Finally, here is the sad trail of testdisk.log that I have left behind: https://dl.dropboxusercontent.com/u/2776730/testdisk.log

edit: output of lsblk:

crw@bilby:~$ sudo lsblk
NAME                 MAJ:MIN RM  SIZE RO TYPE  MOUNTPOINT
sda                    8:0    0 59.6G  0 disk  
├─sda1                 8:1    0  243M  0 part  /boot
├─sda2                 8:2    0    1K  0 part  
└─sda5                 8:5    0 59.4G  0 part  
  ├─bilby--vg-root   252:0    0 43.4G  0 lvm   /
  └─bilby--vg-swap_1 252:1    0   16G  0 lvm   [SWAP]
sdb                    8:16   0  1.8T  0 disk  
└─md0                  9:0    0  4.1T  0 raid5 
sdc                    8:32   0  1.8T  0 disk  
└─md0                  9:0    0  4.1T  0 raid5 
sdd                    8:48   0  1.8T  0 disk  
└─md0                  9:0    0  4.1T  0 raid5 
sde                    8:64   0  1.8T  0 disk  
└─md0                  9:0    0  4.1T  0 raid5 

I am completely lost and suspect I have made things worse. My questions are:

  • Do I need to "fix" the partition information before dealing with the LVM issues?
  • Should I attempt a "pvcreate --uuid xxx --restorefile yyy"?
  • Would I then need to extend the disk and run something like the XFS equivalent of fsck?
  • Or is my data lost to me at this point? :'(

Please let me know if there is anything I can add to make debugging this issue easier. Thanks!

 

If any of this starts to not work, or stops making sense, STOP and ask a subject-matter expert: this is unsafe work. Operate on disk images copied with dd, either to files on a large storage medium or directly to new disks of equal or greater size, to protect your original data set from tomfoolery. You may perform these operations on the single live set, but if you mess up, that could be it for your data.
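
For example, a minimal sketch of imaging one array member to a file before touching anything (the destination path is an assumption, and each image needs as much space as the source disk):

# dd if=/dev/sdb of=/mnt/backup/sdb.img bs=4M conv=noerror,sync status=progress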

Alright. To begin, we need to repair this storage stack methodically, from the base disk level up. You ran a FreeDOS installer, and that messed with your disks by (presumably) creating a partition table on one of them.

Your disks participate in the MD array directly, with no partition table to speak of. This is fairly typical. However, the array also uses version 0.90 metadata, so putting a partition table on any of those member disks directly will mess with the array.

Check whether any of those disks (sdb through sde) now has a partition table on it, showing up as /dev/sdb1 for example. If one does, you will need to consider it dirty and take it out of your array, placing it back in only after getting rid of that table.
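
A quick way to check, as a sketch assuming the member devices shown in your lsblk output:

# lsblk /dev/sd[b-e]
# fdisk -l /dev/sdb /dev/sdc /dev/sdd /dev/sde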

Even if we don't see a partition on any of those disks, an integrity check needs to be run on /dev/md0. The command to do this is simple:

# /usr/share/mdadm/checkarray -a /dev/mdX

If that comes back with a mismatch count greater than zero, the array will have to be repaired. We'll visit that if need be, as it currently doesn't look like the issue.
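
Once the check finishes, the mismatch count can be read from sysfs; a quick look, assuming the array is md0:

# cat /sys/block/md0/md/mismatch_cnt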

On to more concrete problems: testdisk put a GPT on /dev/md0, along with a partition on that device (/dev/md0p1). This was never supposed to be there and is corrupting your LVM metadata. Your volume group is meant to reside directly on /dev/md0, as that is the way you originally created it.

First, we will have to deal with that errant GPT on /dev/md0. It needs to be "zapped". Zapping a GPT blanks all GPT structures, returning the device to one with no partition table, as it should be in this case. This article details the process excellently: "http://www.rodsbooks.com/gdisk/wipegpt.html". If you don't zap it, you will have a broken GPT structure on that device that partitioning utilities will try to "correct", causing problems for you down the road all over again.
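
As a minimal sketch, the zap can be done with sgdisk from the gdisk package described in that article (it wipes both the GPT structures and the protective MBR, so double-check the target device first):

# sgdisk --zap-all /dev/md0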

After doing that, you can re-create all of your LVM metadata using the archive file you posted in your question. Thankfully, you've given me enough information to just hand you a command that will work. If you want to know more about this process, this is a great resource: "https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux/6/html/Logical_Volume_Manager_Administration/mdatarecover.html".

The command to recreate your physical volume with all of its original metadata:

# pvcreate --uuid "bKQs0l-zNhs-X4vw-NDfz-IMFs-cJxs-y0k6yG" --restorefile /etc/lvm/archive/vg0_00002-935168089.vg /dev/md0

This archive file describes /dev/md0 as the disk that constitutes your volume group, and the restore will use it, as it should. If you have a later archive file in your /etc/lvm/archive directory, USE THAT INSTEAD. The goal is to bring the volume group to its latest valid state.
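
After the physical volume label is back, the volume group metadata itself still has to be restored from the same file. A minimal sketch, mirroring the vgcfgrestore step used earlier in this article and assuming the same archive file:

# vgcfgrestore -f /etc/lvm/archive/vg0_00002-935168089.vg vg0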

After this, checking your PV, VG, and LV integrity is key. You've already attempted this, but this time it should be more productive. The commands pvck and vgck are what should be used here.

First, perform pvck:

# pvck /dev/md0

After that validates, run vgck:

# vgck vg0

Once you have validated all metadata, it's time to activate your LVs, if they are not already:

# vgchange -ay vg0
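
At this point lvs should show the lv0 volume from your archive file as active; a quick verification, using the names from that archive:

# lvs vg0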

And finally, checking the filesystem on /dev/mapper/vg0-lv0 (which in your case is XFS) for potential errors:

# xfs_check /dev/mapper/vg0-lv0

That should return nothing if there are no errors. (On newer xfsprogs releases xfs_check has been removed; xfs_repair -n performs the same read-only check.) If something is amiss, then xfs_repair will be necessary (do not run it while the filesystem is mounted):

# xfs_repair /dev/mapper/vg0-lv0
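
If the repair completes cleanly, the filesystem can then be mounted to verify the data; mounting read-only first is the cautious choice (the /mnt mount point is an assumption):

# mount -o ro /dev/mapper/vg0-lv0 /mnt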



Article Number: 464
Posted: Wed, Jan 23, 2019 9:07 PM
Last Updated: Wed, Jan 23, 2019 9:15 PM

Online URL: http://kb.ictbanking.net/article.php?id=464