Author: Fabian Lenz

vSphere and CPU power management: performance vs. costs in the VDI field

Sometimes I love being an instructor. Two weeks of Optimize and Scale classes, and I finally have valid and realistic numbers from two of my participants regarding performance vs. power usage.

First of all, thanks to Thomas Bröcker and Alexander Ganser, who not only discussed this topic with me but also ran this experiment in their environment. I am also proud that I seem to have motivated Alexander to blog about his findings in English :-). While his post focuses on hosting server applications on Dell/Fujitsu hardware (-> please have a look at it), I will extend this information with data from an HP-based VDI environment, where the impact on performance, power usage and costs was much higher than I had expected.

The green-IT trend has not just led to more efficient consumer CPUs; it is also becoming more and more of a trend in modern datacenters. Hosts are powered down and on automatically (DPM – productive users of this feature, please contact me 😉 ), CPU frequencies are changed dynamically, or cores are disabled on demand (core parking). Since I always recommend NOT using any power management features in a server environment, I am now following up on this topic with suitable and realistic numbers from a production environment.

A few details about the setup and the scenario I am going to talk about. For my calculations later on I selected a common VDI size of around 1000 Windows 7 virtual machines.

VDI: 1000

Number of ESXi (NoESXi): 20 (vSphere 5.5 U2)

CPU type: 2x Intel Xeon E5-2665 (8 cores, 2.4 – 3.1 GHz, TDP 115 W)

vCPU per VM: 4 (pretty high for regular VDI – but multimedia / video capability was a requirement; on average 80% of the VDIs have active users)

vCPU / Core rate: 12.5

A few comments on the data. Intranet video quality was miserable with the initial VM sizing (1 vCPU). We took a known and approved methodology covering the 2 performance-affecting dimensions:

  • 1st dimension: Sizing of a virtual machine (is the virtual hardware enough for the proposed workload?) – verified by checking whether the end-user is satisfied with the performance (embedded videos play fluently).
  • 2nd dimension: Sharing of resources (how much contention can we tolerate when multiple virtual hardware instances (vCPUs) share the physical hardware (cores)?) – verified by defining thresholds for specific ESXi metrics.

As a baseline approach we defined that an intranet video needs to run fluently and that the ESXTOP metrics %RDY (per vCPU – to determine general scheduling contention) and %CSTP (co-stop – to determine scheduling difficulties caused by the 4-vCPU SMP) must not reach specific thresholds (3% ready / 0% co-stop) during working hours. *

// * of course we would run into resource contention once every user on this ESXi host started watching a video within the virtual desktop, resulting in a much higher %RDY value.

So far so good. The following parameters are the dependent variables for the power costs of such an environment. Of course the metrics used can differ between countries (price of energy) and datacenter types (cooling).

Power usage per host: This data was taken in real-time via iLO on an HP DL380 G8 and describes the current power usage of the server. We tested the following energy-saver settings (they can be changed during runtime and take effect immediately):

  • HP Dynamic Power Savings

  • Static High Performance Mode

Climate factor: A metric defining how much additional power is needed to cool the IT systems within a datacenter. This varies a lot between datacenters; as a source I refer to Centron, who did an analysis in German concluding that the factor is between 1.3 and 1.5, meaning that for every 100 watts used by a component we need 30–50 watts of cooling energy. I will use a value of 1.5; it can differ a lot in each datacenter.

Power price: This price differs the most between countries, depending on regulations. The price is normalized per kilowatt-hour (kWh), i.e. how much you pay for 1000 watts of power usage over 1 hour. Small companies in Germany pay around 25 cents per kWh, while large enterprises with a huge power demand pay much less (around 10 cents per kWh).

Data was collected on a regular workday (Friday) at around 11 AM, so we assume it represents a typical office-hour workload.

Avg. power usage per host, Power Savings (PU-PS) = 170 Watt = 0.170 kW

Avg. power usage per host, High Performance (PU-HP) = 230 Watt = 0.230 kW

Price per kWh (price) = 0.25 Euro

Climate factor (cli-fa) = 1.5

So let's take the data and do some calculations based on the VDI server data mentioned above:

VDI power costs per year = NoESXi * (price * PU-XX * 24 * 365) * cli-fa

Power costs per year, Power Saving mode = 20 * (0.25 Euro/kWh * 0.170 kW * 24 * 365) * 1.5 = 11169 Euro

Power costs per year, High Performance mode = 20 * (0.25 Euro/kWh * 0.230 kW * 24 * 365) * 1.5 = 15111 Euro

11169 Euro vs. 15111 Euro a year (for the power of around 1000 VDIs)
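To make the math reproducible, here is a minimal PowerShell sketch of the formula above (the values are the ones measured in this environment – adjust them for yours):

# Yearly power costs per power-management mode, using the formula above
$NoESXi = 20      # number of ESXi hosts
$price  = 0.25    # Euro per kWh
$cliFa  = 1.5     # climate factor (cooling overhead)

$costsPS = $NoESXi * ($price * 0.170 * 24 * 365) * $cliFa   # Power Saving     -> 11169 Euro
$costsHP = $NoESXi * ($price * 0.230 * 24 * 365) * $cliFa   # High Performance -> 15111 Euro
"Power Saving: {0:N0} Euro / High Performance: {1:N0} Euro per year" -f $costsPS, $costsHP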

The saving from the power-saving mode is very high in a VDI environment and far smaller when the ESXi host is used for server virtualization (I refer back to Alexander Ganser's blog post, since we observed nearly the same numbers for our servers). Server virtualization has a higher constant CPU load, while the VDI workload pattern is much more intermittent and gives the CPU more chances to quiesce down a little. We observed around 10% power savings in the server field.

So now let's take a step further and compare the influence of the energy-saving mode on performance.

  • HP Dynamic Power Savings: CPU ready avg. of 2% per vCPU (= 400 ms in the real-time charts)

  • Static High Performance Mode: CPU ready avg. of 1% per vCPU (= 200 ms in the real-time charts)

[Image: PowerSaving-ESX]

[Image: PowerSaving-ILO]

As you can see, the power-management setting has a direct impact on the ready values of our virtual machines' vCPUs. At the end of the day the power savings do have a financial impact in the VDI field, but I still always recommend deactivating ALL power-saving mechanisms, since I always try to ensure the highest performance.
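If you want to verify the effect in your own environment, a short PowerCLI sketch like this can pull the relevant ready values (the host name is an assumption; real-time samples are 20 seconds long):

# Read the latest real-time CPU ready summation (ms) of all VMs on one host
# and convert it to a percentage of the 20-second real-time sample interval.
# Note: the aggregate instance ('') sums up all vCPUs of a VM.
$vms = Get-VMHost 'esx01.lab.local' | Get-VM
Get-Stat -Entity $vms -Stat 'cpu.ready.summation' -Realtime -MaxSamples 1 |
    Where-Object { $_.Instance -eq '' } |
    Select-Object Entity, Value, @{ N = 'ReadyPct'; E = { [math]::Round($_.Value / 20000 * 100, 2) } }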

Especially in the VDI field, with its irregular, sudden CPU spikes, the wake-up / clock ramp-up of a core takes too much time – and if you read through the VMware community on a regular basis, you will see that a lot of strange symptoms are very often resolved by disabling energy-saving mechanisms.

Please be aware that those numbers may differ in your environment depending on your server, climate-factor, consolidation-rate, etc.

Automatic virtual hardware change after shutdown – vCenter alarms and #PowerCLI

While teaching an Optimize and Scale class this week, I was asked if there is a way to automatically change the virtual hardware once a VM has been shut down or powered off.

Scenario – the customer wants to change the CPU / RAM of their XenDesktops with minimal impact on the end-user. If a VM is powered off, vCenter should change the hardware and wait until another component powers the VM on again.

At first I was not sure how this could be done. Using PowerCLI for changing the virtual hardware? – easy… but how can we call this script once the VM is powered off? After a minute I remembered that we can use vCenter alarms to trigger more than just SNMP/email actions when a specific event/condition occurs. We can also run commands and therefore call a PowerCLI script.

Requirements:

Using AD-based authentication for the PowerCLI script makes the authentication mechanism easier. Therefore the vCenter server must run with an Active Directory service account that has the proper permissions at the vCenter level.

[Image: vcenteralarmaction01]

[Image: vcenteralarmaction02]

Chapter 1:  Easy going – Hardcode the CPU count in the script

Create the PowerCLI script

Now we create the modify-cpu.ps1 script, stored in C:\scripts\, with the following content. (Note: the CPU count must be hardcoded in the script by changing the $numCPU parameter. Be aware that this script changes the number of cores and stays with 1 virtual socket.)

Complete Script – modify-cpu.ps1:

param(
    [string]$vmname    # VM name, passed by the vCenter alarm via {targetName}
)

$vCenter = 'localhost'

$numCPU = 4            # hardcoded target CPU count
####
# Include VMware PowerCLI SnapIn
if(!(Get-PSSnapin | Where-Object {$_.Name -eq "vmware.vimautomation.core"}))
{
    try
    {
        Write-Host "Adding PowerCLI snap-in"
        Add-PSSnapin VMware.VimAutomation.Core -ea 0 | Out-Null
    }
    catch
    {
        throw "Could not load PowerCLI snapin"
    }
}

Connect-VIServer $vCenter

# Reconfigure cores per socket and total CPU count (keeps 1 virtual socket)
$vm = Get-VM $vmname
$spec = New-Object -Type VMware.Vim.VirtualMachineConfigSpec -Property @{"NumCoresPerSocket" = $numCPU; "numCPUs" = $numCPU}
$vm.ExtensionData.ReconfigVM_Task($spec) | Out-Null

Disconnect-VIServer $vCenter -Confirm:$false


Create the vCenter alarm that will call the script above with the VM name as parameter

Select the VM whose virtual CPU count should be changed after a shutdown and create a vCenter alarm (wow, I am really using the Web Client).

Give it a meaningful name and description and select to monitor for a specific event.

[Image: vcenteralarmaction03]

As the event on which the alarm – and therefore the action – will be triggered, we select 'VM – Powered off'.

[Image: vcenteralarmaction04]

And finally as an action we call our created PowerCLI script in C:\scripts with the following statement:

"c:\windows\system32\cmd.exe" "/c echo.|powershell.exe -nologo -noprofile -noninteractive C:\scripts\modify-cpu.ps1 {targetName}"

With the {targetName} variable we pass along the name of the virtual machine that triggered the alarm.

[Image: vcenteralarmaction05]

And voilà – if the VM now gets powered off, the virtual hardware change is carried out.

[Image: vcenteralarmaction06]

(German language, hm!? Doesn't it sound quite nice? :P)

[Image: vcenteralarmaction07]

If the configuration is not working as expected, validate the functionality by calling cmd as the user that runs the vCenter service:

> Open CMD

> runas /user:Domain\vCenteruser cmd

> "c:\windows\system32\cmd.exe" "/c echo.|powershell.exe -nologo -noprofile -noninteractive C:\scripts\modify-cpu.ps1 <VM name>"

(Use a real VM name here – the {targetName} variable is only resolved by vCenter alarms.)

Chapter 2:  Make it more awesome by using VMs and Templates Folders

To be more flexible on the hardware configuration I wanted to have the following functionality:

Drag and drop a VM into a specific folder. If the VM is powered off, change the virtual hardware based on specific settings extracted from the folder name.

Folder-name specification: modify_resourcetype_value

For example, I created 4 folders:

  • modify_cpu_2
  • modify_cpu_4
  • modify_ram_4
  • modify_ram_8

If you drag a VM into the modify_ram_8 folder and power it off, the VM will be configured with 8 GB of memory.
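As a small optional helper, the four folders can be created via PowerCLI like this (a sketch; the datacenter name 'DC01' is an assumption):

# Create the four example folders below the 'vm' inventory root of the datacenter
$root = Get-Folder -Name vm -Location (Get-Datacenter 'DC01')
'modify_cpu_2', 'modify_cpu_4', 'modify_ram_4', 'modify_ram_8' |
    ForEach-Object { New-Folder -Name $_ -Location $root }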

So I needed to change my script so that it gathers the folder of the VM whose name is passed with the script call:

First snippet of modify-hardware.ps1:

$vm = get-vm $vmname
$vmFolderName = $vm.Folder.Name

and split the folder name at the '_' characters:

$items = $vmFolderName.Split('_')
$ressourceType = $items[1]
$amount = $items[2]

Now I can select the RAM- or CPU-change logic with a 'switch' statement:

switch($ressourceType){

    'cpu'
     {
         $vCPU = [int]$amount    # cast the folder-name value to int
         $spec = New-Object -Type VMware.Vim.VirtualMachineConfigSpec -Property @{"NumCoresPerSocket" = $vCPU; "numCPUs" = $vCPU}
         $vm.ExtensionData.ReconfigVM_Task($spec) | Out-Null
     }
    'ram'
     {
         $NmbrRam = [int]$amount
         Set-VM -VM $vm -MemoryGB $NmbrRam -Confirm:$false
     }
     default
     {
         Write-Verbose 'Wrong folder name specification'
     }
}

If the VM is not in a folder that matches the specification, nothing will happen.

Transfer the following modify-hardware.ps1 script into C:\scripts\ and configure the alarms the same way as described above – but this time on the specific folders we created for this solution.

The action call needs the new script name this time, since I created another script:

"c:\windows\system32\cmd.exe" "/c echo.|powershell.exe -nologo -noprofile -noninteractive C:\scripts\modify-hardware.ps1 {targetName}"

[Image: vcenteralarmaction08]

Complete Script – modify-hardware.ps1:

param(
    [string]$vmname    # VM name, passed by the vCenter alarm via {targetName}
)

$vCenter = 'localhost'

###
# Include VMware PowerCLI SnapIn
if(!(Get-PSSnapin | Where-Object {$_.Name -eq "vmware.vimautomation.core"}))
{
    try
    {
        Write-Host "Adding PowerCLI snap-in"
        Add-PSSnapin VMware.VimAutomation.Core -ea 0 | Out-Null
    }
    catch
    {
        throw "Could not load PowerCLI snapin"
    }
}

Connect-VIServer $vCenter

# Gather objects and data
$vm = Get-VM $vmname
$vmFolderName = $vm.Folder.Name

# Split the folder name to extract the parameters - folder name must be modify_ressource_x,
# where ressource must be ram or cpu and x the number of CPUs or the amount of RAM in GB
$items = $vmFolderName.Split('_')
$ressourceType = $items[1]
$amount = $items[2]

switch($ressourceType){

    'cpu'
     {
         $vCPU = [int]$amount    # cast the folder-name value to int
         $spec = New-Object -Type VMware.Vim.VirtualMachineConfigSpec -Property @{"NumCoresPerSocket" = $vCPU; "numCPUs" = $vCPU}
         $vm.ExtensionData.ReconfigVM_Task($spec) | Out-Null
     }
    'ram'
     {
         $NmbrRam = [int]$amount
         Set-VM -VM $vm -MemoryGB $NmbrRam -Confirm:$false
     }
     default
     {
         Write-Verbose 'Wrong folder name specification'
     }
}

Disconnect-VIServer $vCenter -Confirm:$false

Maybe some of you will benefit from this little script collection. Have fun with it 😉

IMO: Is SMP fault tolerance even useful? My view on it!

Maish Saidel-Keesing wrote a post about the fault-tolerance topic with multiple vCPUs a few weeks ago. He has valid points in his argumentation, but I still want to give you a little bit of my view on this topic (IMO).

With fault tolerance, two VMs run nearly in lockstep on 2 different ESXi hosts, with one (primary) processing IO and the other one (secondary) dropping it. With the release of vSphere 6.0, VMware will support this feature for a VM with up to 4 vCPUs and 64 GB of memory. [More details here]

Let me try to summarize the outcome of Maish's argumentation:

FT is not that big a deal, since it only protects against a hardware failure of the ESXi host without any interruption in the service of the protected VM. It does NOT detect or deal with failures at the operating-system or application level.

So what Maish thinks we really need are cluster mechanisms at the application level – the open question being what to do if legacy applications don't offer them.

I would in general not disagree with this opinion. In an ideal world all applications would be stateless, scalable and protectable with a load-balancer in front of them. But we will need 1X or more years until all applications are built in such a new 'modern' way. We will not get rid of the legacy applications in the short term.

Within the last 4 years of being an instructor, I have received one question nearly every time when delivering a vSphere class:

‘Can we finally protect our SMP-VMs now with Fault Tolerance? No?! Awww :(‘

So I would not say there is no need out there for this feature. Being involved in some biddings last year, we very often had the requirement to deliver a system for automation solutions within large building complexes (airports, factories, etc.).

Software used in such domains is sometimes legacy application par excellence (irony intended), programmed with paradigms from long before agile/RESTful/virtualization played a role in the tech world. Sometimes you can license a cluster feature (and pay 10 times as much as for a 1-node licence) – sometimes you can't cluster it at all and need other ideas or workarounds to increase availability.

Some biddings were lost to competitors who were able to deliver solutions that can (on paper) tolerate a hardware outage without any service/session impact.

For me, with SMP-FT the typical design considerations come into play:

  • How does the cluster work? Does it work at the application/OS level, or does it only protect against a general outage?
  • What were the failure/failover reasons in the past? (e.g. vCenter – in most cases where I saw a failure, it was a database problem [40%], an Active Directory / SSO problem [10%], a hardware failure [45%], or the rest [5%]) -> A feature like FT would have protected against a large share of the failures experienced in the past. The same considerations apply to all kinds of applications (e.g. virtual load-balancers, Horizon View Connection Servers, etc.).
  • How much would a suitable solution cost to make, buy or update?

Sure, we need to get rid of legacy applications, but to be honest… this will be a very long road (the business decides and pays for it), and once we have gotten to the point where the legacy applications are gone – the next generation of legacy applications will be in place and need to be transformed (Docker?! 😉 ).

We should see FT for what it is: a new tool within our VMware toolkit to fit specific requirements and protect VMs (legacy and new ones) on a new level, with pros and cons (as always). IMO every tool / feature that gives us more opportunities to protect the IT is very welcome.

Vote Now!!! – Top VMware blog voting has just started

As every year, Eric Siebert has spent a lot of effort creating and maintaining an index of all VMware blogs within the community. And as every year, Eric calls on all friends of VMware and virtualization to vote for the top VMware blogs on the world wide web.

I know that we won't get many votes with our blog here @vxpertise.net, but maybe there is one or another reader who liked the topics we have talked about so far – or enjoyed the conversation with us over a certain topic (Google Analytics shows me that we have some useful information that people from all over the world spend time with).

———————– Vote HERE! —————————-


So if you came here by accident, or if we were able to help you sometimes by delivering information within our texts, and you have 1 vote left besides all of the great writers out there (Duncan Epping – Yellow Bricks, William Lam – virtuallyGhetto, Frank Denneman, Scott Lowe, Derek Seaman, Cormac Hogan, Eric Sloof, Manfred Hofer and many many many more – wow, the more I think about great bloggers in the community, the more I vRealize that every single vote will be a success).

Anyway – thanks a lot so far, and since my new lab environment is very close to my 'office', I am pretty sure you will find a lot of new and hopefully useful stuff about VMware NSX & vSphere 6.x topics here in the next months.


VMware Update to vSphere 5.5 and Horizon View 6.0 – vCenter service not working properly

A few days ago I received a mail from a former student of mine. They had updated their VMware environment to the latest vSphere 5.5 U2 and afterwards Horizon View from 5.2 to 6.0.

From a procedural point of view everything seemed to have worked fine. But on a second look he realized that in the Horizon View Manager dashboard the vCenter was marked red ('service is not working properly') and pool operations were not working anymore.

[Screenshot: vCenter service not working properly]

From a systematic troubleshooting perspective, I recommended he first check that the connectivity between the Connection Server and vCenter was fine. OSI layers 1-4 were working well (and the ports haven't changed between the VMware versions). For checking connectivity on the layers above 4, I told him to check the 'classical' access logs to spot an authentication problem:

%ProgramData%\VMware\VirtualCenter\vpxd.log
%ProgramData%\VMware\CIS\IMStrace.log

%ProgramData%\VMware\VDM\logs\*.log

and to verify that the service account has proper vCenter access and the correct permissions set within a role.
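A short PowerCLI sketch like the following can speed up that permission check (the account name 'DOMAIN\svc-view' is an assumption – use your actual View service account):

# List all vCenter permissions of the View service account
Connect-VIServer 'vcenter.lab.local' | Out-Null
Get-VIPermission | Where-Object { $_.Principal -eq 'DOMAIN\svc-view' } |
    Select-Object Principal, Role, Entity, Propagate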

And voilà -> the service user's vCenter permission had been removed during the upgrade (-> all other permissions were still in place). Maybe a malfunction during the SSO / AD LDS upgrade. Unfortunately I am not able to take a closer look to do a root-cause analysis of it.

Anyway! If you observe similar issues -> a) use a systematic approach to verify system communication, or b) check the vCenter permissions directly.

A generic IT-Infrastructure operations manual model – A place to start

Last year I spent a lot of time on the operations side of life. Bringing new IT systems into a productive and operational state is a pretty interesting and challenging topic. Since I am a freelancer, I often need to get a pretty quick insight into new environments. And the first touch points are always important documents like the design document and ……

The operations manual

Having such a document has a lot of benefits:

  • Gives a first insight to new members of the operations team (IT operations has a high fluctuation, hm?)
  • Offers a chance to have operations tasks neutrally audited/reviewed by 3rd parties
  • Has all the information necessary to operate a system in 1 document (yeah I know… the design documentation would be veerrry beneficial as well)
  • …………… so many more (please comment)

Even though many companies struggle to create one (sure, it costs time and therefore money), I will try to give a good starting point with the following operating model I have created. This model can be used for creating a new operations manual from scratch, or for auditing your existing manual. Please be aware that this is a generic model and is not specific to certain environments.

I don't claim to know and have included every important thing that must be in such a manual… so feel free to give me feedback, and I will update the document accordingly (if the discussion leads to a conclusion that lets the model evolve and brings more benefit to all of us).

I divided the model into 3 sections that must be in every operations manual.

  1. General information: Which IT service is delivered? Which communication channels are used? Which persons are important for operations and during escalations?
  2. Functional tasks / requirements of an ops manual (does anyone have a better wording to describe those?): Concrete tasks and information used by operators/administrators to keep up the basic functionality of the IT systems.
  3. Non-functional tasks: Tasks to ensure the performance and availability of the solution. These tasks ensure the quality of the environment and are typically separated into two phases – detecting and acting (e.g. failure & recovery, performance problem & fix). IMO these are the tasks that are really important to grow from a pure cost-driver within a company into a service-provider. A lot of organizations have structured methods for detecting, but are missing a well-structured process afterwards.

[Image: vXpertise_Operating_Model]

At the end of the day I believe that each operations manual should give information about the items mentioned. Having a structured document with all of this information about the environment separates the boys from the men (from an organisational/maturity point of view 😉 ) –> so let's grow up, create one, and give me feedback about your experiences with this type of document.

VMware EVO:RAIL: How Vendors differentiate (#IMO edition)

During my last #IMO on EVO:RAIL I asked myself how vendors are going to differentiate themselves on these standardised / hyper-converged solutions:

  • pricing: Make or buy – the good old discussion from economics hits us again. EVO:RAIL vendors will try to figure out how the market reacts to the hyper-converged solution. The implementation (cost) block always comes with a certain risk level (bad external service provider, human mistakes). This risk is somewhat mitigated by using the EVO engine for the implementation/configuration task. But since the hyper-converged market is pretty new, our vendors and their shareholders expect a higher margin out of it. The vendors will learn their lesson and their (EVO) margin will decrease within the next years. Howard Marks did a nice analysis on the pricing part of EVO:RAIL. Let's wait and see how the real price will differ from the list price.
  • vendor support
  • vendor-specific software and bundles: This is a very interesting topic, since a lot of companies are currently trying to bundle EVO:RAIL with specific software packages (for management/monitoring) or even hardware (storage). I try to summarise the main differences in the next lines according to the currently public information.

Dell

Dell has announced it will bundle its EVO:RAIL solution with the software-defined storage (SDS) solution NexentaStor, a storage solution based on the ZFS filesystem.

So the big question is: why combine VMware's SDS, vSAN, with another SDS solution? The Nexenta part should not be seen as a substitute for vSAN. It is more of an added functionality, offering NFS (v3, v4), SMB, snapshot and deduplication capabilities integrated into vSphere management via NexentaConnect. This might IMO be a useful extension for very specific use cases.

Besides the Nexenta integration, Dell has announced a VDI package for EVO:RAIL (haven't I asked for it? 😉 ). Whether this solution will include VMware Horizon View or Dell's vWorkspace (Quest) has not officially been announced.

Functional Advantage Level (none-low-medium-high): medium

NetApp

It was kind of a surprise when NetApp announced that they are going to provide EVO:RAIL solutions as well. NetApp bundles their FAS solution (based on ONTAP) with the EVO:RAIL solution and offers new storage capabilities to the vSphere environment. As with Nexenta, you will be able to extend the EVO:RAIL functionality with features like NFS, SMB, deduplication, etc. But unlike Nexenta, NetApp is integrating a dedicated FAS unit into the EVO:RAIL solution.

The big question is: will it bring real benefit if we now need to manage two storage systems (OK, vSAN doesn't need to be managed that often), while at the same time the price for the FAS solution will most certainly be passed on to the customer in the end? Without any concrete use cases I am not sure customers are willing to pay a higher amount for the NetApp EVOs.

Functional Advantage Level (none-low-medium-high): medium

EMC

EMC's EVO:RAIL solution will be based on EMC's Phoenix Foundation (didn't MacGyver work for them?). To differentiate themselves, EMC is planning to integrate/bundle their own data protection software (vSphere Data Protection Advanced (integrated with EMC's Data Domain?) and/or RecoverPoint as a disaster recovery technology).

Functional Advantage Level (none-low-medium-high): low

Fujitsu

Fujitsu's EVO:RAIL is based on the CX400 S2 nodes, which are advertised with a higher temperature tolerance. According to the official notes, the CX400 S2 supports an ambient temperature between 10 and 35 degrees Celsius, a value that might increase to 40 degrees Celsius with the next EVO:RAIL generation. This is definitely an interesting approach, since reduced cooling costs within a datacenter are always welcome.

So we see Fujitsu trying to differentiate themselves through an improvement of their x86 nodes. I am honestly not sure how much money a company will save in the end by using Fujitsu hardware. Even if I knew that my hardware tolerates higher temperatures, I am not sure I would reduce the cooling within my datacenter.

Functional Advantage Level (none-low-medium-high): none

HP

HP is attacking the hyper-converged market with 2 solutions called HP ConvergedSystem 200. Based on the same components, one version is an EVO:RAIL solution, while the other version is based on HP's own scale-out storage solution, the Virtual Storage Appliance (VSA).

HP tries to differentiate itself from the other vendors by integrating EVO:RAIL into their OneView management solution. As a consequence, EVO:RAILs can be managed in the same way as the other HP components.

Functional Advantage Level (none-low-medium-high): low-medium

For the following companies I have not found/received further information so far. I will update the post as soon as I have new information on it.

INSPUR

INSPUR is the first partner for VMware EVO:RAIL in China. Until now I have not been able to gather any further information about the way INSPUR will differentiate within the existing ecosystem. But since the market in China is more regulated than others, differentiation is not even that necessary.

Hitachi

More information is hopefully coming.

net one

More information is hopefully coming.

SUPERMICRO

More information is hopefully coming.


Comparing just the functional benefit of each EVO:RAIL vendor, I don't believe those functionalities will convince customers to pay additional bucks. IMO vendors will need to work harder on bringing functional benefit to their own EVO:RAIL solutions.

In the end I believe that the existing vendor relationships and the pricing will be the most important factors. The year 2015 will show us how the market reacts.


VMware VCAP5-DCD (Datacenter Design) – Exam experience and learning philosophies

To be honest, I didn't want to write a blog post about my VCAP5-DCD exam experience, since there are soooo many good articles and posts already online. Anyway, a lot of people have been asking me what resources I used to prepare for this exam to achieve the ….

[Image: vcap5-dcd]

First of all… thanks a lot to everyone else who has created posts about their experiences. I think I read all of the existing posts about the VCAP5-DCD exam.

The content below is structured in the following way – if you are just interested in hard facts, please go directly to the resources part 😉

  • What type of ‘learner’ am I
  • Why I did the exam
  • What resources are useful to pass the exam
  • Personal hints and tips on doing the exam

What type of ‘learner’ am I?

I don't even know if the expression 'learner' exists in the English language. Anyway, I believe that everyone who is extending his knowledge needs to find out HOW he learns best. I don't focus only on VMware here; I try to be as general as possible in the following description.

My school career showed me that I am not a good learner in the traditional way. If I have to read a book of pure theory, without any practically relevant connection, I cannot focus for more than 2 pages. Even if I read 50 pages of the book, my mind was only really active for the first 3 minutes.

So what does this mean? I personally need a practical connection to some 'issues'/'events' I have experienced in my life. This experience can derive from the following:

1. having experienced something in the real world ("live the challenge") -> maximum personal involvement, but pretty expensive (time-/cost-consuming)

2. talking and listening to someone who has experienced something similar ("feel the challenge") -> medium personal involvement, but it might be hard to find the right people in the correct domain (e.g. user groups, conventions, tech talks, vBeer, …)

3. reading from someone who had an experience ("read the challenge") -> low personal involvement, inexpensive within the world wide web.

The more personally a piece of information is absorbed, the more I internalize the challenge, and the better and more attentively I can read/learn new things.

This is exactly the reason why I love technical blogs. People describe things more concretely, related to their experiences, in a very personal way. This is something a technical documentation or book typically does not do (of course there are exceptions).

Technical documentation and books are pretty good and very important resources as well, but in my case I need a personal experience first; afterwards I can read the technical documentation much more attentively and with many more take-aways, since I am then aware of concrete uses of the information.

Another important thing for me is the following: if I am confronted with a lot of new information over multiple days in a short period (WebEx, classroom teaching, breakout sessions), I personally need a few weeks to process all this data.

When I am 'attacked' by a lot of information which I was not able to process (e.g. during the class), I need a break from those topics. Even though I am not mentally and actively working on those things, my brain seems to make progress on the data subconsciously ('excuse the non-scientific correctness'). And in many cases something suddenly happens that I call 'illumination'… everything, out of nothing, makes total sense from one second to another ('no joke, from time to time some mathematical facts I never understood in school are suddenly illuminated in my mind 😉 you see, it might take a verrry long time…. next step… find out how to accelerate the illumination phase).

A third important fact worth mentioning: I need pressure. Without time pressure, my efficiency typically decreases a lot.

To summarize it all…. what do I personally need to extend my knowledge in the best way? Personal involvement AND time for the illumination….

So let's see how this all works out for the VCAP-DCD exam.

Why I did the exam

The why is always important. During my career I have met so many people with all kinds of environments, worked on a lot of projects, and talked to so many experts, and there is one thing I realised pretty soon.

It is incredibly important to have a good architectural design of an IT system. And it is so easy to screw IT systems up if you don't do it right. Since I work in the IT field as a professional (and not only as a geek/nerd who loves technologies), I have always been impressed by people working in a very structured/methodical way. Today (until SKYNET rises) IT systems support people AND/OR businesses. This means an IT system needs to align with the business. If you only look at an IT system as a collection of technical best practices, you will probably have a great technical solution, but it will not be the best solution for the business itself.

The ability to gather business requirements and design/transform this information (and probably even implement it) into a solid technical solution is, in my opinion, a skill every architect should be capable of.

I knew from several discussions, blogs and books that VMware's highest certification (#VCDX) is exactly about proving this skill set. Since the VCDX is still a long-term goal for me, the VCAP-DCD exam was the right one to take.

So I decided to take the VCAP-DCD exam at the beginning of 2013. And I found so many good excuses to postpone it month after month since then (projects, Master's thesis, …). Since I focus only on VMware in my job, I had already read the most common VMware literature that is recommended for any kind of vSphere-related exam (VMware vSphere Design, Clustering Technical Deep Dive, …). I was often involved in design tasks within my job (projects/trainings/discussions), so the exam preparation was a kind of long-term preparation with many situations where suddenly the (knowledge) illumination kicked in (ILLUMINATION).

As time was passing, in August 2014 I gave myself a deadline: I MUST pass the VCAP-DCD by December 2014 (PRESSURE). So I started to learn more concretely along the blueprint… and to be honest… it was a really exciting AND effective way of learning, since I already had the personal involvement and practical relationship from all of these years.

What resources are useful to pass the exam

Now let me get concrete about the resources I used to learn and pass the exam.

  • Exam blueprint: First of all – as for every VMware exam, the exam blueprint is the baseline for the certification. I decided to take the 5.1 version because I had worked for almost a year in a very large vSphere 5.1 environment and had requested the exam authorization sometime in 2013.
  • Clustering Technical Deepdive & vSphere Design: IMO these 2 books are compulsory reading if you want to extend your knowledge in the vSphere field.
  • #vBrownbag VCAP5-DCD video sessions: When I met Alastair Cooke (a big contributor to vBrownbag) during the vRockstar party at VMworld, I was not aware that one week later the video sessions would contribute a large portion to my successfully passed exam. In those video sessions very good experts talk about every objective of the exam blueprint. A must-watch for everyone who wishes to take the VCAP-DCD exam. Nick Marshall has collected all the video parts on his blog. Personally, I focused only on those topics where I thought I had the least knowledge.
  • VMware VCAP-DCD51 doc package: Jason Langer created a great document about VCAP-DCD-relevant documents, structured according to the exam blueprint. Most of the files are official VMware documents that are probably outdated now if you want to take the 5.5 exam. But since the design methodologies (e.g. the paper about the differences between conceptual, logical and physical design) are not pinned to a specific version, it is a really useful resource as well.
  • VMware best practices technical whitepapers
  • Blogs, blogs, blogs about the VCAP-DCD: I will not be able to mention all I have read, but just google VCAP-DCD and you will find a lot of entries. Everyone has had different experiences with the exam… some focus on the timing, some on the technical challenges in the exam, and some on how they learned for it.
  • Gather hands-on design and technology experience

I know a lot of architects who tell me that an architect doesn't need to know too many technical details about their solution. Honestly, I agree up to a point: an architect does not need to know every configured item in the physical design by heart. But at the same time I am sure that the more detailed knowledge an architect has, the better his design decisions will be. So it is a big advantage for studying for the DCD exam if you are a professional from a vSphere operational/administrative point of view. I would recommend everyone do the VCAP-DCA exam first. It makes life and learning much easier if you are very familiar with the technical details of a vSphere environment.

Personal hints and tips on doing the exam

A design exam is hard to learn for, since this is something where personal experience plays an important role. If you have worked in the VMware design field for a certain time, know the technology very well and want to improve this knowledge, define a concrete time frame for when you want to take the exam. One month before the exam, start to use the resources mentioned above. Read them, understand them, and think about how these design methodologies and technical best practices would have changed your past projects. Don't just memorize these things; understanding and illumination are key to successfully passing the exam. The DCD is an exam where you will always reach a point where you do not know everything.

One more hint… stay calm… the technical implementation, especially of the design parts where you drag and drop items into a logical design, is pretty bad… The Flash application hung up 3 times during my exam, and I had to keep talking to the Pearson administrator so that he could restart the system and the test. Those things are pretty, pretty annoying… but IMO if you can't change things, make the best out of them…. stay calm, don't get nervous, and take a mental break.

Make sure that the items are connected to each other (if you move one item, all other connected items should move as well), BUT DON'T try to mark all of them at the end… in my case the system always crashed…

I am not sure if that was an unlucky accident in my exam or whether it is fixed in the 5.5 version… but I would not risk it anymore.

So to everyone who is going on the VCAP-DCD journey… good luck, and if you want more information about it, feel free to ask me.


Killing me softly/hardly/forcely. How to kill a VM via #PowerCLI

Having observed some problems with VMs that could not be shut down / powered off properly via PowerCLI, I tried to find a solution.

From time to time Shutdown-VMGuest didn't work, and even Stop-VM with the -Kill option was not working as expected. I knew that ESXTOP and ESXCLI have options to kill a VM process/world if there are no other options. But since I wanted to achieve this in PowerCLI, this blog post from 2011 gave me the right hint.

We can use ESXCLI via PowerCLI to fulfil that task  😉 *whoopwhoop*.

I was missing a way to kill those worlds without authenticating directly against an ESXi host, and since ESXCLI and its namespaces have changed a little over the last years, I wanted to document how this can be achieved in vSphere 5.x.

First of all connect to the vCenter via

Connect-VIServer $vCenterFQDN

Get the VM-Object you are going to kill

$vm = Get-VM $nameOfTheVM

find the ESXi-Host where the VM is running on

$esxi = Get-VMHost -VM $vm

and load the ESXCLI functionality

$esxcli = Get-EsxCli -VMhost $esxi

Now it’s time to extract the WorldID out of the ESXCLI VM PROCESS LIST data

$worldid = $esxcli.vm.process.list() | Where-Object { $_.DisplayName -eq $vm.Name } | Select-Object WorldID

and kill the VM with the options soft, hard, force

$esxcli.vm.process.kill("force",$worldid.WorldID)

VOILA – the VM should definitely be killed now. This ESXCLI command is not tracked by the vpxa, so no events of the 'kill' are written to the database. (With great knowledge comes great responsibility, right? ;-))

If you run this command against a VM that is part of an HA cluster, the HA mechanism will reboot the VM after the kill. In this scenario you need to disable the HA protection of the VM (so it is removed from the HA protected list) before you kill it, via:

$vm | Set-VM -HARestartPriority Disabled -Confirm:$false

I hope this information might be useful to some of you guys.

Please use the code snippet here to see the fully functional script (Kill-VM.ps1).
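In case the embedded snippet does not render, here is a minimal consolidated sketch of the steps above (the parameter handling is my own addition, so the original Kill-VM.ps1 may differ):

# Kill-VM.ps1 - kill a VM world via ESXCLI through PowerCLI (vSphere 5.x)
param(
    [string]$vCenterFQDN,
    [string]$nameOfTheVM,
    [ValidateSet('soft','hard','force')]
    [string]$killType = 'soft'
)

Connect-VIServer $vCenterFQDN | Out-Null

$vm     = Get-VM $nameOfTheVM
$esxi   = Get-VMHost -VM $vm
$esxcli = Get-EsxCli -VMHost $esxi

# Remove the VM from the HA protected list first, so HA does not restart it
$vm | Set-VM -HARestartPriority Disabled -Confirm:$false | Out-Null

# Extract the world ID of the VM process and kill it
$worldid = $esxcli.vm.process.list() | Where-Object { $_.DisplayName -eq $vm.Name } | Select-Object WorldID
$esxcli.vm.process.kill($killType, $worldid.WorldID)

Disconnect-VIServer $vCenterFQDN -Confirm:$false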


IMO: #VMworld 2014 recap Automation & Orchestration (part 5)

Sitting here at the airport in Bucharest, I thought I could finally write down my IMO thoughts about the whole automation/orchestration topic.

Since I had more fun writing about automation than about vSAN/vVols, I did it like George Lucas and mixed up the order of my parts/episodes 😉

IMO: #VMworld 2014 recap on VMware NSX (part 1)

IMO: #VMworld 2014 recap VMware EVO:RAIL (part 2)

IMO: #VMworld 2014 recap VMware vCloud Air (part 3)

IMO: #VMworld 2014 recap vSAN and vVol (part 4)

IMO: #VMworld 2014 recap Automation & Orchestration (part 5)

I visited a lot of breakout sessions regarding automation and scripting. Some of them were really, really good, with some great core messages; for other sessions my scripting or programming skill-set was honestly not good enough to get it all ;-).

2014 was kind of a PowerCLI year for me. I automated a lot of stuff in a huge project with PowerCLI. I did not just use PowerCLI for interacting with or automating vSphere objects (VMs, clusters, datastores, …), but also to automate/optimize operational and implementation tasks (vCenter / SQL installation, automatic setup, …). There are just so many amazing things you can do with PowerShell/PowerCLI.

So, IMO, whoever is going to read this (if you are one of my students, you will know this message):

Don’t be afraid of learning automation via scripts because it is related to programming.

In my opinion (and I meet/teach around 100 people a year from all kinds of IT-infrastructure backgrounds), so many people are afraid because they have never been good at programming. This might well be correct, but there is no need to worry. I am definitely not a programmer, and to be honest I don't consider myself a PowerShell/PowerCLI professional either. Nevertheless, PowerShell/PowerCLI makes it really easy to get started, because …

  • … the community is so f***** great.
  • … you have a sense of achievement pretty soon (I mean, an output of 'hello world' never really made me proud, but creating 50 VMs out of a template with 1 line in 2 minutes is somehow a really cool thing – see the sketch after this list).
  • … the community is so f***** great ;-).
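For the curious, such a one-liner could look roughly like this (template name 'Win7-Template', cluster 'Prod' and the naming scheme are assumptions):

# Clone 50 VMs from a template, fired off asynchronously
1..50 | ForEach-Object { New-VM -Name ("VDI-{0:D3}" -f $_) -Template 'Win7-Template' -ResourcePool (Get-Cluster 'Prod') -RunAsync }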

Automation is the future of IT infrastructure, especially now that we are heading step by step towards the software-defined datacenter. Each component in the infrastructure opens itself up via an API against which we can run our code. So what is the next step for me personally? Evolving from scripting to orchestrating.

During VMworld, the session MGT2525 – Chasing the White Rabbit all the Way to Wonderland: Extending vCloud Automation Center Requests with vCenter Orchestrator – had a great take-away about which order of automation is best.

Policy-driven things (think vVols/vSAN) are probably not what I will implement in the near future (I'm not a developer…….yet :P). Anyway, I might be able to get much deeper into the whole orchestration (vCenter/vRealize Orchestrator) topic.

Working a lot in the automation field with script languages like PowerShell, I realized the benefits and weaknesses of purely scripted solutions. If you want an automation engine done via a script language (e.g. PowerShell/PowerCLI), it works pretty well. But among other things, you have to reinvent the wheel all the time. How can an object within a workflow be stored persistently? How can a workflow be paused/resumed? How is functionality extended via standardized plug-ins? How can I scale such an automation engine up? A lot of things come up during development that have to be dealt with. These topics are the reason I believe that professional orchestration solutions are a much better choice. I will try to find out and be more specific within the next months ;-).

So how do we start learning this stuff? Having some chats after a #vBrownbag session with Joerg Lew (@joerglew – he was introduced to me as, and obviously is, the orchestration guru), he gave me some good advice on how to start when I want to learn about vRealize/vCenter Orchestrator.

That's exactly what I am going to do in the next months…. If 2014 was my year of PowerCLI, 2015 will be my year of orchestration.

So you want to see how my learning goes? I will try to keep you informed right here on this blog… stay tuned…

(And if I have not made any progress on automation by the end of next year… feel free to kick my ass if you see me 😉 )
