[virt-tools-list] Live migration of iscsi based targets
Phil Meyer
pmeyer at themeyerfarm.com
Wed Oct 27 17:09:36 UTC 2010
On 10/27/2010 03:10 AM, Gildas Bayard wrote:
> Hello,
>
> I'm using libvirt and KVM for a dozen virtual servers. Each virtual
> server's disk is an iscsi LUN that is mounted by the physical host
> blade which runs KVM.
> Everything has worked fine at that stage for about a year. Both the
> servers and the blades are running Ubuntu Server 10.04 LTS.
> I had been trying live migration for a while, but it was not working,
> at least with my setup, on previous versions of Ubuntu (virt-manager
> showed the VM on the target host, but the machine became unreachable
> over the network).
>
> Anyway, for some reason it's working now. But there's a big problem:
> let's say I use 2 blades (A and B) to host my VMs. If I start a VM on
> blade A and live migrate it to blade B, everything is fine. But if I
> migrate it back to blade A, awful things happen: at first it's OK, but
> sooner or later the VM will complain about disk corruption and destroy
> itself more and more as time goes by. Oops!
>
> My understanding is that blade A still has its iscsi disk cache up and
> running, and when the VM comes back, blade A has no way of knowing
> that the VM's disk was altered during its stay on blade B. Hence the
> corruption.
>
> Am I getting this correct? Should I switch to NFS "disk in a file"
> instead of using iscsi?
>
> Sincerely,
> Gildas
This is how we do it here.
Prior to the live migration:
Add permissions to the iscsi target for the New Host.
New Host discovers and logs into the iscsi target.
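That discovery and login step is just the usual open-iscsi commands,
roughly like this (the portal address is a placeholder):

iscsiadm -m discovery -t sendtargets -p <portal-ip>:3260
iscsiadm -m node -T <iscsi-target> -p <portal-ip>:3260 --login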
Live migrate:
virsh -c qemu+ssh://root@<Old Host>/system migrate --live <domain> \
    qemu+ssh://root@<New Host>/system
On the new host:
virsh dumpxml <domain> > /tmp/<domain>.xml
virsh define /tmp/<domain>.xml
On the old host:
virsh undefine <domain>
iscsiadm -m node --logout -T <iscsi-target>
iscsiadm -m node -T <iscsi-target> -o delete
Remove permissions from the iscsi target for the Old Host.
That seems to work reliably for us.
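Put together, a minimal sketch of the whole sequence, assuming
passwordless root ssh to both blades and placeholder values throughout
(domain name, target IQN, portal, host names):

#!/bin/sh
# Minimal sketch of the sequence above; every value is a placeholder.
DOMAIN=vm01
TARGET=iqn.2001-05.com.example:vm01
PORTAL=192.0.2.10:3260
OLD=bladeA
NEW=bladeB

# New host: discover and log in to the iscsi target
# (the array must already permit access from the new host).
ssh root@$NEW "iscsiadm -m discovery -t sendtargets -p $PORTAL"
ssh root@$NEW "iscsiadm -m node -T $TARGET -p $PORTAL --login"

# Live migrate.
virsh -c qemu+ssh://root@$OLD/system migrate --live $DOMAIN \
    qemu+ssh://root@$NEW/system

# New host: make the migrated domain's definition persistent.
ssh root@$NEW "virsh dumpxml $DOMAIN > /tmp/$DOMAIN.xml"
ssh root@$NEW "virsh define /tmp/$DOMAIN.xml"

# Old host: drop the stale definition and the iscsi session.
ssh root@$OLD "virsh undefine $DOMAIN"
ssh root@$OLD "iscsiadm -m node -T $TARGET --logout"
ssh root@$OLD "iscsiadm -m node -T $TARGET -o delete"

# Finally, remove the old host's access to the target on the array
# (how you do that is array-specific).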
As for iscsi vs. NFS:
We tested this extensively and found that a reasonably optimized
iscsi-based storage array, using one iscsi target per VM, gave vastly
better I/O performance than NFS-backed disk files.
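To give an idea of what "one iscsi target per VM" looks like to the
guest: once the host is logged into the LUN, the domain's disk stanza
just points at the resulting block device. A hypothetical example (the
by-path device name and the cache setting here are illustrative, not
taken from the setup above):

  <disk type='block' device='disk'>
    <driver name='qemu' type='raw' cache='none'/>
    <source dev='/dev/disk/by-path/ip-192.0.2.10:3260-iscsi-iqn.2001-05.com.example:vm01-lun-0'/>
    <target dev='vda' bus='virtio'/>
  </disk>

With cache='none' the host page cache stays out of the I/O path, which
is the usual recommendation when a guest can move between hosts.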
We tested Linux-based iscsi target hosts using both TGT and
iscsi-target, as well as several vendor-provided iscsi storage arrays.
As a result of that testing, we settled on the Dell EqualLogic arrays.
They can support 512 connections per pool, which (barely) meets our
very specific requirements.
But the kicker for the EqualLogic is its dedicated 4 GB cache. Working
within that cache from up to 500 VMs, we were able to sustain 7,000
IOPS using 14 ordinary SATA data drives (16 drives in RAID 6).
Normally those drives, on a pair of 3Ware controllers (512 MB cache per
controller), maxed out at 1,500 IOPS no matter how we configured the
targets.
We are keeping a close eye on the Isilon products. We have an Isilon
cluster in production for NFS, and they now have a fair-to-good iscsi
implementation that is getting better with each release. Version 6 may
bring it on par with (or ahead of) everyone else. In any case, Isilon's
ability to grow nearly forever with no hassle is a major selling point:
dropping a new unit into the cluster and having those new resources
available within 60 seconds is a very long way from the 36-hour wait to
VERIFY a new EqualLogic array!
Good Luck!