[virt-tools-list] Failure to update CPU usage display
Charles Arnold
carnold at suse.com
Thu Apr 2 18:04:08 UTC 2015
>>> On 4/1/2015 at 05:14 PM, Cole Robinson <crobinso at redhat.com> wrote:
> On 04/01/2015 05:42 PM, Charles Arnold wrote:
>> Under certain conditions I see the guest CPU usage display not
>> being updated. The problem seems most prevalent when Xen
>> is the underlying hypervisor but I have seen it occasionally with
>> KVM.
>>
>> What appears to be happening is that when a VM is created, the
>> engine's _handle_tick_queue() calls the connection tick() method
>> which in turn calls _tick(). In here it calls _update_vms() which
>> gets a new vmmDomain object. Then it spawns a new thread to
>> call tick_send_signals() where if 'pollvm' is True then 'self._vms'
>> will get the newly create vmmDomain object.
>>
>> The problem (I think) is that under certain conditions the thread
>> spun up by self.idle_add(tick_send_signals) doesn't run quickly
>> enough (and therefore doesn't set self._vms) until after _tick() returns
>> and is called again and second vmmDomain object is created.
>>
>> At this point it appears that the first vmmDomain object collects the
>> cpu stats from libvirt while the second vmmDomain object updates
>> the display which doesn't have the stats and therefore nothing appears.
>>
>
> That analysis sounds reasonable. Since tick() is running in a thread, if the
> mainloop is handling lots of UI stuff or similar it seems possible that
> tick()
> could be invoked twice before the mainloop handles tick_send_signals.
>
> Unfortunately our threading model is pretty hacky here since we have threads
> touch a bunch of shared state without any locking, but it's simple and
> generally works out fine.
>
>> Below are a couple approaches to solving the problem (assuming
>> my analysis is correct). This first one simply yields to give the
>> tick_send_signals thread a chance to run.
>>
>
>> diff --git a/virtManager/connection.py b/virtManager/connection.py
>> index a907a3f..af27141 100644
>> --- a/virtManager/connection.py
>> +++ b/virtManager/connection.py
>> @@ -1240,6 +1240,9 @@ class vmmConnection(vmmGObject):
>> self._change_state(self._STATE_ACTIVE)
>>
>> self.idle_add(tick_send_signals)
>> + if len(self._vms) < len(vms):
>> + # Allow time for tick_send_signals to run
>> + time.sleep(.1)
>>
>> ticklist = []
>> def add_to_ticklist(l, args=()):
>>
>
> Obviously a timeout is sketchy since it might just make it less likely to
> trigger.
>
>> This second solution doesn't wait for the thread and sets self._vms
>> if pollvm is True.
>>
>> diff --git a/virtManager/connection.py b/virtManager/connection.py
>> index a907a3f..96c208e 100644
>> --- a/virtManager/connection.py
>> +++ b/virtManager/connection.py
>> @@ -1240,6 +1240,8 @@ class vmmConnection(vmmGObject):
>> self._change_state(self._STATE_ACTIVE)
>>
>> self.idle_add(tick_send_signals)
>> + if pollvm:
>> + self._vms = vms
>>
>> ticklist = []
>> def add_to_ticklist(l, args=()):
>>
>> Any thoughts on this problem and the potential solutions?
>
> We can't update any of the conn lists in the thread since it's possible the
> self._vms is in use by the mainloop, which can cause weird errors if it's
> updated while another part of the code is iterating over it or similar.
>
> I think the proper thing to do short of some larger refactoring is to use a
> threading.Lock(). conn.tick() will grab the lock at the start,
> tick_send_signals will release the Lock when it's updated the conn state. So
> tick() can't run until the previous tick_send_signals has finished.
>
> If you can reproduce easily, does this patch fix it (and not cause issues?).
> I
> only did light testing. Not pushed yet
Thanks for the patch. I have a machine that reproduces the problem easily.
I've tested your patch for both KVM and Xen and it fixes the problem without any
adverse side effects that I was able to see. I started about a dozen installs on
each (same machine booted either KVM or Xen) with a mix of ISO and network
installation sources for KVM, Xen PV, and Xen HVM. I would say the patch is
good.
- Charles
More information about the virt-tools-list
mailing list