I remember reading the paper “Virtualizing Performance Counters” by Benjamin Serebrin and Daniel Hecht some time ago, probably while I was researching the other article about selecting the MMU virtualization mode. At the time, I was wondering “when will virtual hardware performance counters be implemented?” and now here we are: ESXi 5.1 supports them. In this post, I would like to show you how to use virtual hardware performance counters and how they make it easier to decide on the MMU virtualization mode.

(Virtual) Hardware Performance Counters

Hardware performance counters, often referred to as PMCs (performance monitoring counters), are a set of special-purpose registers that a physical CPU provides to count low-level, hardware-related events that occur inside the CPU. Because they give deep insight into CPU activity, PMCs are usually utilized by profilers. Until now, the VMM in VMware virtual machines did not emulate PMCs for vCPUs, which rendered profilers inside VMs useless in a lot of scenarios. With ESXi 5.1, VMware eradicated this flaw by introducing vPMC. For the technical folks out there, I recommend reading the paper mentioned above. It gives good insight into the implications and challenges that come with the technology.

Requirements and Prerequisites

VMware’s KB article 2030221 gives the full list of prerequisites to meet before you can make use of vPMC. Let me summarize this for you:

  • ESXi 5.1 or later
  • VM with HW9 or later
  • Intel Nehalem or AMD Opteron Gen 3 (Greyhound) or later
  • vPMC enabled for the VM in the vSphere Web Client (the option does not show up in the vSphere Client!)
  • the ESXi host must not be in an EVC-enabled DRS cluster

Enabling vPMC for a VM adds further sanity checks before that VM can be moved with vMotion. The target host must support the same PMCs; otherwise, vMotion will not be possible.

Freeze Mode

The freeze mode is an advanced VM setting that specifies when PMC events are accounted for in the vPMC value. While a VM is running, instructions can be executed …

  • directly by the guest or
  • by the hypervisor on behalf of the guest.

The freeze mode defines which hardware events should be accounted for in which of the above situations.

  • guest: In this mode, events increment during direct guest execution only.
  • vcpu: Events also increment while the hypervisor executes instructions on behalf of the VM.
  • hybrid: Events counting the number of instructions and branches are handled as if in guest freeze mode. Otherwise, vcpu freeze mode behavior is used.

If you wish to change that behavior, set the vpmc.freezeMode option to “hybrid”, “guest” or “vcpu” (hybrid is the default).
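As a rough sketch, one way to apply this setting is to write it into the VM's .vmx file while the VM is powered off. Editing the .vmx file directly is an assumption on my part; the option can just as well be entered as an advanced configuration parameter in the client. The helper name set_freeze_mode and the file path in the usage line are made up for illustration, only the option name vpmc.freezeMode and its three values come from the text above.

```shell
#!/bin/sh
# Sketch: set vpmc.freezeMode in a .vmx file (VM powered off).
# set_freeze_mode is a hypothetical helper, not a VMware tool.
set_freeze_mode() {
    vmx_file=$1
    mode=$2

    # Accept only the three freeze modes described above.
    case "$mode" in
        hybrid|guest|vcpu) ;;
        *) echo "invalid freeze mode: $mode" >&2; return 1 ;;
    esac

    tmp_file="$vmx_file.tmp"
    # Drop any existing entry. grep exits non-zero when it prints nothing,
    # so "|| true" keeps the function going in that case.
    grep -v '^vpmc\.freezeMode' "$vmx_file" > "$tmp_file" || true
    # .vmx files store settings as: key = "value"
    printf 'vpmc.freezeMode = "%s"\n' "$mode" >> "$tmp_file"
    mv "$tmp_file" "$vmx_file"
}

# Usage (hypothetical path):
#   set_freeze_mode /vmfs/volumes/datastore1/myvm/myvm.vmx guest
```

The function replaces an existing entry instead of appending a second one, so it can be run repeatedly while experimenting with the different modes.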

Reading Metrics in Linux

Counting hardware performance metrics with Linux is quite easy. On Debian Squeeze, install the “perf” tool by executing the following line:

apt-get install linux-base linux-tools-`uname -r`

Perf is quite a powerful tool for profiling, and such tools are usually just as complex to use. A good overview of the command’s usage can be found here. To get a full list of PMCs supported by perf, type

perf list

Not all counters listed might be supported by HW9 though. To analyze a certain process for a specific event use

perf stat -e EVENT COMMAND

where EVENT is the name of the event as listed in “perf list” and COMMAND is the command to execute in order to create the process to monitor.
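Since not every listed counter is necessarily available inside the VM, it can help to probe an event before relying on it. The following is a minimal sketch that assumes perf stat marks events it cannot count with “<not supported>” or “<not counted>” in its report; the helper name event_supported is made up for illustration.

```shell
#!/bin/sh
# Sketch: check whether a perf event can be counted inside this VM.
# event_supported is a hypothetical helper built on "perf stat".
event_supported() {
    event=$1
    # perf stat writes its report to stderr, so merge it into stdout.
    # A non-zero exit (e.g. unknown event name) counts as "not usable".
    report=$(perf stat -e "$event" true 2>&1) || return 1
    case "$report" in
        *"<not supported>"*|*"<not counted>"*) return 1 ;;
    esac
    return 0
}

# Usage:
#   event_supported dTLB-load-misses && echo "dTLB-load-misses is usable"
```

The probe runs the trivial command “true”, so it finishes almost instantly and does not disturb the workload you actually want to measure.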

Using vPMC for Selecting the MMU Virtualization Mode

Recently, I posted an article about selecting the proper MMU virtualization mode for a certain application. It suggested conducting an isolated experiment with the VM in question and using vmkperf to monitor the number of TLB misses. This procedure is quite cumbersome, as “isolated experiment” means setting up a dedicated ESXi host with a single VM and the application to monitor just for the testing, which might not even be possible if the application is already running in production.

Luckily, vPMC allows access to the TLB miss event! That means we can now perform the testing of a VM sharing an ESXi host with other VMs! That gives us a much easier way to find out whether we need to change the MMU virtualization mode back to shadow page tables:

With nested paging enabled:

root@minisqueeze:~# perf stat -e dTLB-load-misses perl tlbstress.pl
Performance counter stats for 'perl tlbstress.pl':
68456568 dTLB-load-misses # 0.000 M/sec
60.072490066 seconds time elapsed
root@minisqueeze:~#

With shadow page tables enabled:

root@minisqueeze:~# perf stat -e dTLB-load-misses perl tlbstress.pl
Performance counter stats for 'perl tlbstress.pl':
38750437 dTLB-load-misses # 0.000 M/sec
59.116399919 seconds time elapsed
root@minisqueeze:~#

The script executed is the same one already used in the other article, but this time we measure the TLB miss count from within the VM using perf. The results show the same picture: nested paging produced about 68.5 million dTLB load misses versus about 38.8 million with shadow page tables, roughly 77% more in a run of about 60 seconds. The consequences of this were already explained in the other article.
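To put a number on the difference, the two perf reports above can be compared with a few lines of shell. This is only a sketch: the helper name miss_count is made up, and the reports are pasted in verbatim from the two runs shown above.

```shell
#!/bin/sh
# Sketch: compare the dTLB-load-misses counts of the two perf runs above.
# miss_count is a hypothetical helper; it reads "perf stat" output on stdin.
miss_count() {
    # The count is the first field on the line that names the event.
    # Newer perf versions print thousands separators, so strip commas.
    awk '$2 == "dTLB-load-misses" { gsub(/,/, "", $1); print $1 }'
}

nested=$(miss_count <<'EOF'
Performance counter stats for 'perl tlbstress.pl':
68456568 dTLB-load-misses # 0.000 M/sec
60.072490066 seconds time elapsed
EOF
)

shadow=$(miss_count <<'EOF'
Performance counter stats for 'perl tlbstress.pl':
38750437 dTLB-load-misses # 0.000 M/sec
59.116399919 seconds time elapsed
EOF
)

# Ratio of misses with nested paging to misses with shadow page tables.
ratio=$(LC_ALL=C awk -v n="$nested" -v s="$shadow" 'BEGIN { printf "%.2f", n / s }')
echo "nested/shadow miss ratio: $ratio"
```

For the numbers above this prints a ratio of 1.77. In practice you would pipe the stderr output of perf stat into miss_count instead of pasting the report.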

Unfortunately, I could not quickly find an easy way to monitor the same counters on Windows systems. I will keep looking and keep you posted, as I understand Windows is the more commonly used platform out there, as much as I hate admitting it ;-)

Acknowledgements

Thank you, Thomas Glanzmann, for providing recent hardware to test this new feature!

Resources