Wednesday, April 4, 2012

Online - hot extend of a physical volume on an active volumegroup

This test covers two scenarios. The goal is to see which options we have for extending a volume group (physical volume) online on a VMware Linux guest, and to summarize the different methods. In every scenario the first step is to extend a vmdk online, and the disk being extended is always /dev/sdb. We did not use multipathing software in our tests (relevant if you have a physical machine); that is material for another post. Feel free to comment.

Both scenarios were tested on 32-bit machines:


Host: ESX 4.1.0 virtual machine version 7
Guest: CentOS release 5.7 - kernel 2.6.18-274.el5

Host: ESX 4.1.0 virtual machine version 4
Guest: SLES10 2.6.16.60-0.85.1-smp

Host: ESX 4.1.0 virtual machine version 4
Guest: SLES11 3.0.13-0.27-pae

Host: ESX 5 virtual machine version 8
Guest: RHEL6.2 2.6.32-220.el6.i686



Scenario 1: Online extend pv created in first partition of a device (e.g. /dev/sdb1) of an active vg


Steps taken:
  1. Extend disk in vmware
  2. result: fdisk -l /dev/sdb does not show extended size in guest   
  3. echo 1 > /sys/block/sdb/device/rescan or rescan-scsi-bus.sh --forcerescan (only possible on SUSE)
  4. result: fdisk -l /dev/sdb shows extended size of /dev/sdb , now we can extend /dev/sdb1    
  5. do fdisk /dev/sdb, in fdisk execute following steps:
  • delete partition /dev/sdb1
  • create a new partition /dev/sdb1, starting at the same sector as the old one, which uses the extended space
  • set the partition type to Linux LVM (8e)
  • write partition table changes to disk
Writing the changes gives the error:

WARNING: Re-reading the partition table failed with error 16: Device or resource busy. The kernel still uses the old table. The new table will be used at the next reboot.

At this point the new partition table has been written to disk, but the kernel is still using the old in-memory partition table (see cat /proc/partitions). Trying to inform the kernel of the new size of /dev/sdb1 with partprobe does not succeed.
The following errors were observed:
SLES11: Error: Partition(s) 1 on /dev/sdb have been written, but we have been unable to inform the kernel of the change, probably because it/they are in use. As a result, the old partition(s) will remain in use. You should reboot now before making further changes.
RHEL6: Warning: WARNING: the kernel failed to re-read the partition table on /dev/sda (Device or resource busy). As a result, it may not reflect all of your changes until after reboot.
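
The mismatch described above (new table on disk, old table in the kernel) can be made visible by reading the sizes the kernel holds in sysfs. The following is a minimal sketch, not a tested procedure from this post: the function names are made up for illustration, and SYSFS_ROOT is a hypothetical override that exists only so the functions can be tried against a fake directory tree.

```shell
# Sketch only: helper names and SYSFS_ROOT are assumptions, not part of any tool.

# Ask the SCSI layer to re-read the capacity of a disk (step 3 above).
# $1 = disk name without /dev/, e.g. sdb. Must be run as root on a real system.
rescan_disk() {
    echo 1 > "${SYSFS_ROOT:-/sys}/block/$1/device/rescan"
}

# Print the size the kernel currently holds for a disk or partition, in MiB.
# $1 = path below /sys/block, e.g. "sdb" or "sdb/sdb1".
kernel_size_mib() {
    sectors=$(cat "${SYSFS_ROOT:-/sys}/block/$1/size") || return 1
    # sysfs reports sizes in 512-byte sectors; 2048 sectors make 1 MiB
    echo $((sectors / 2048))
}
```

After writing the new table with fdisk, kernel_size_mib sdb should report the extended disk size while kernel_size_mib sdb/sdb1 still reports the old partition size, which is the same information /proc/partitions gives.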

Note: If we do not extend the first partition but instead create a new partition (/dev/sdb2) to use as a new physical volume, partprobe cannot make this second partition visible to the kernel either, as long as the first partition on the disk is in use/active. In effect this is the same problem as extending the first partition.

The only thing keeping us from a successful pvresize on /dev/sdb1 is that the kernel cannot be made to re-read the partition table of a disk that is in use.

So what can you do if your physical volume was created on a partition (e.g. /dev/sdb1) and you do not want to reboot the server to add free space to the volume group?
Do not extend the existing vmdk; add another vmdk to the guest instead (on the guest, rescan the SCSI bus to see the device: echo "- - -" > /sys/class/scsi_host/hostX/scan) and use this new vmdk as another PV in your VG (e.g. pvcreate /dev/sdc, then vgextend VG /dev/sdc). This workaround is simple, but it can become cluttered if you add too many devices over successive extensions of the volume group.
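
The workaround above can be sketched as a few shell functions. This is an illustration under assumptions, not the exact commands from the test: the helper names, the DRY_RUN switch and the SYSFS_ROOT override are hypothetical, and VG and /dev/sdc are placeholders for your own volume group and the newly added disk.

```shell
# Sketch only: run, scan_scsi_hosts, add_disk_to_vg, DRY_RUN and SYSFS_ROOT
# are names invented for this example.

# With DRY_RUN=1 (the default) commands are printed, not executed.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Trigger a scan on every SCSI host so the guest sees the new vmdk.
# This writes to sysfs directly and needs root on a real system.
scan_scsi_hosts() {
    for host in "${SYSFS_ROOT:-/sys}"/class/scsi_host/host*; do
        [ -e "$host/scan" ] || continue
        echo "- - -" > "$host/scan"
    done
}

# Turn the new whole disk into a PV and add it to the volume group.
# $1 = volume group name, $2 = new disk, e.g. /dev/sdc
add_disk_to_vg() {
    run pvcreate "$2" && run vgextend "$1" "$2"
}
```

Calling add_disk_to_vg VG /dev/sdc prints the two LVM commands; with DRY_RUN=0 it executes them as root.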

Of course, if your business has no objection to the server or filesystem being unavailable for some time, you can reboot the server, or you can unmount the LVs and run vgchange -a n VG so that the kernel no longer has the partition in use. After that, partprobe succeeds in making the kernel use the new partition table (/proc/partitions is updated as well). Do not forget to activate the volume group again (vgchange -a y VG) before remounting your filesystems.
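
The offline sequence just described could look roughly like this. It is a sketch under assumptions: the function names and DRY_RUN switch are invented for the example, VG, /dev/sdb, /dev/sdb1 and /data are placeholders, the partition is assumed to have been recreated with fdisk already, and the final mount assumes the mount points are listed in /etc/fstab.

```shell
# Sketch only: run, offline_pv_resize and DRY_RUN are hypothetical names.

# With DRY_RUN=1 (the default) commands are printed, not executed.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# $1 = volume group, $2 = disk, $3 = PV partition, remaining args = mount points
offline_pv_resize() {
    vg=$1; disk=$2; part=$3
    shift 3
    # Release the partition: unmount the filesystems, deactivate the VG
    for mp in "$@"; do run umount "$mp" || return 1; done
    run vgchange -a n "$vg" || return 1
    # The partition is no longer in use, so the kernel can re-read the table
    run partprobe "$disk" || return 1
    # Reactivate the VG, grow the PV to the new partition size, remount
    run vgchange -a y "$vg" || return 1
    run pvresize "$part" || return 1
    for mp in "$@"; do run mount "$mp" || return 1; done
}
```

For example, offline_pv_resize VG /dev/sdb /dev/sdb1 /data prints the six commands in order; set DRY_RUN=0 to execute them as root.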


Scenario 2: online extend pv, created directly on the device (/dev/sdb), of an active vg
Steps taken:
  1. Extend disk in vmware
  2. result: fdisk -l /dev/sdb does not show extended size in guest 
  3. blockdev --rereadpt /dev/sdb
  4. result: fdisk -l /dev/sdb now shows the extended size, as does cat /proc/partitions. Note: partprobe was not necessary because there are no partitions here. I do not know why the SCSI bus did not have to be rescanned, though.
  5. pvresize /dev/sdb (= success)
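
The guest-side steps of scenario 2 can be sketched as a single function. This is only an illustration: the helper names and the DRY_RUN switch are invented for the example, /dev/sdb is the disk from this test, and the final pvs call is an added check that was not part of the original steps.

```shell
# Sketch only: run, online_pv_resize and DRY_RUN are hypothetical names.

# With DRY_RUN=1 (the default) commands are printed, not executed.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# $1 = whole-disk PV, e.g. /dev/sdb (vmdk already extended in VMware)
online_pv_resize() {
    # Make the kernel pick up the new disk size
    run blockdev --rereadpt "$1" || return 1
    # Grow the PV to fill the disk, then show the result
    run pvresize "$1" || return 1
    run pvs "$1"
}
```

Running online_pv_resize /dev/sdb prints the three commands; with DRY_RUN=0 they are executed as root.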

Conclusion

On a Linux guest running on VMware, extending a physical volume online on an active VG is only possible if the PV was created directly on a disk (e.g. pvcreate /dev/sdb).
If the PV was created on a partition, that partition needs to be extended first, and the kernel can only read the updated partition table (partprobe) when the partition/disk is not in use.
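
Since the outcome depends on whether a PV sits on a whole disk or on a partition, it can help to check this up front. The sketch below is an assumption-laden illustration: pv_layout and SYSFS_ROOT are made-up names, and the check relies on partitions appearing as subdirectories of their disk under /sys/block (e.g. /sys/block/sdb/sdb1). Device-mapper or multipath names are not handled.

```shell
# Sketch only: pv_layout and SYSFS_ROOT are hypothetical names.

# $1 = device name without /dev/, e.g. sdb or sdb1
# Prints "partition", "whole-disk" or "unknown".
pv_layout() {
    root="${SYSFS_ROOT:-/sys}"
    # A partition shows up as a directory below its parent disk
    for d in "$root"/block/*/"$1"; do
        if [ -d "$d" ]; then
            echo partition
            return 0
        fi
    done
    # A whole disk has its own entry directly under /sys/block
    if [ -e "$root/block/$1" ]; then
        echo whole-disk
    else
        echo unknown
    fi
}
```

A possible use is to loop over the output of pvs --noheadings -o pv_name, strip the /dev/ prefix from each name and pass it to pv_layout; any PV reported as "partition" will need one of the scenario 1 workarounds.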