vSphere Replication is a really cool application that helps us replicate our virtual machines at the VM level without needing dedicated, replicating storage. Besides the conventional replication methods, it can also be used to replicate to a public cloud provider such as VMware vCloud Air (call it DRaaS or whatever 😉). I am still lacking deep technical knowledge of VMware’s hybrid cloud approach; it will be a separate blog post once I know more ;-)

In the following I want to give an architectural overview of the new version 6.0 of vSphere Replication. During some of my Site Recovery Manager classes I realized that people get confused by some of the terminology and maximums mentioned in the KB articles, so I wanted to create something that clarifies all of those things.

This article is the first part of the vSphere Replication 6.0 series (not sure if series is the right word if only 3 episodes are planned 😉).

General features and what’s new in vSphere Replication 6.0 (NEW)

  • Replication on a virtual machine level
  • Replication functionality embedded in the VMkernel as the vSphere Replication agent
  • RPO (Recovery Point Objective = the amount of data, measured in time, that can be lost): 15 minutes – 24 hours
  • Quiescing of the guest OS to get more than crash consistency: application consistency for Windows (via VSS) and file-system consistency for Linux (NEW)
  • Replication independent of the underlying storage technology (vSAN, NAS, SAN, local)
  • Support for enhanced vSphere functionalities (Storage DRS, svMotion, vSAN)
  • Initial replication can be optimized by manually transferring the virtual machine to the recovery location (replication seeds)
  • The transfer of changed blocks (not CBT-based) and the initial synchronization can be compressed to reduce the required network bandwidth (NEW)
  • Network traffic can be configured more granularly and isolated better (NEW) using dedicated VMkernel functions for NFC and vSphere Replication traffic (vSphere 6.0 required)
  • Up to 2000 replications (NEW)
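
To put some numbers on the RPO and compression bullets above, here is a minimal Python sketch of a back-of-the-envelope bandwidth estimate. It is not VMware's official sizing method: the function names, the simple "changed data per RPO window" model and the `compression_ratio` parameter are my own assumptions for illustration, and real compression ratios depend heavily on the data.

```python
# Supported RPO range for vSphere Replication 6.0 as stated above.
MIN_RPO_MINUTES = 15
MAX_RPO_MINUTES = 24 * 60


def validate_rpo(rpo_minutes):
    """Raise ValueError if the RPO is outside the 15 min - 24 h range."""
    if not MIN_RPO_MINUTES <= rpo_minutes <= MAX_RPO_MINUTES:
        raise ValueError(
            f"RPO must be between {MIN_RPO_MINUTES} and {MAX_RPO_MINUTES} minutes"
        )
    return rpo_minutes


def average_bandwidth_mbit(changed_mb_per_window, rpo_minutes, compression_ratio=1.0):
    """Rough average bandwidth (Mbit/s) needed to ship the data changed
    during one RPO window within that same window.

    compression_ratio is a hypothetical value (2.0 means the data shrinks
    to half its size); 1.0 means no compression.
    """
    validate_rpo(rpo_minutes)
    if compression_ratio < 1.0:
        raise ValueError("compression_ratio must be >= 1.0")
    transferred_mb = changed_mb_per_window / compression_ratio
    return transferred_mb * 8 / (rpo_minutes * 60)


if __name__ == "__main__":
    # 900 MB of changed blocks within a 15-minute RPO window.
    print(average_bandwidth_mbit(900, 15))                         # 8.0
    print(average_bandwidth_mbit(900, 15, compression_ratio=2.0))  # 4.0
```

The estimate only averages over the window, so bursts of changes or the initial full synchronization will need more headroom than this number suggests.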

Components required for vSphere Replication

Besides the mandatory components (vCenter, SSO, Inventory Service, Web Client and ESXi), vSphere Replication requires the vSphere Replication appliance, which can be downloaded from VMware.com.

The general task of the vSphere Replication appliance is to receive data (VM files and changes) from the vSphere Replication agent of a protected ESXi host and transfer it to the configured recovery ESXi host (via a mechanism called Network File Copy – NFC).

Now it might get a little bit confusing. The download actually contains 2 different appliances, described by 2 different OVF files that point to the same disk file.

  1. 1x OVF (vSphere_Replication_OVF10.ovf), which is the virtual machine descriptor for the vSphere Replication (manager) appliance – used for SRM or for standalone vSphere Replication – 2 or 4 vCPU / 4 GB RAM
  2. 1x OVF (vSphere_Replication_AddOn_OVF10.ovf), which is the virtual machine descriptor for the vSphere Replication server – can be used to balance the load and increase the maximum number of replicated VMs – 2 vCPU / 512 MB RAM

vSphere Replication (manager) appliance

The vSphere Replication (manager) appliance is the ‘brain’ of the vSphere Replication process and is registered with the vCenter so that the vSphere Web Client is aware of the new functionality. It stores the configuration data in the embedded PostgreSQL database or in an externally added SQL database. The VMware documentation typically talks about the vSphere Replication appliance; to make sure it does not get mixed up with the replication server, I added the (manager) to the term. The vSphere Replication (manager) appliance also includes the first vSphere Replication server. Only 1 vSphere Replication appliance can be registered with a vCenter, and it theoretically supports up to 2000 replications if we have 10 vSphere Replication servers in total. Please be aware of the related VMware KB article if you want to replicate more than 500 VMs, since minor changes to the appliance are mandatory.

vSphere Replication server

The vSphere Replication server is responsible for the replication job itself (gathering data from the source ESXi host and transferring it to the target ESXi host). One instance is included in the vSphere Replication appliance, and each server can effectively handle up to 200 replication jobs. Although I have read in some blogs that spreading the replication load over several vSphere Replication servers is only possible in conjunction with Site Recovery Manager, it works out of the box without Site Recovery Manager.
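
The maximums mentioned above (200 replications per server, 10 servers and 2000 replications in total, with the first server already included in the manager appliance) boil down to simple arithmetic. The following Python sketch shows it; the function name and the return format are my own choices for illustration, not anything from VMware's tooling.

```python
import math

# Maximums as described in this article for vSphere Replication 6.0.
REPLICATIONS_PER_SERVER = 200
MAX_SERVERS = 10
MAX_REPLICATIONS = REPLICATIONS_PER_SERVER * MAX_SERVERS  # 2000


def servers_needed(replicated_vms):
    """Return (total_servers, additional_servers) for a number of replicated VMs.

    total_servers counts the server embedded in the manager appliance;
    additional_servers is the number of add-on appliances to deploy.
    """
    if replicated_vms < 0:
        raise ValueError("replicated_vms must not be negative")
    if replicated_vms > MAX_REPLICATIONS:
        raise ValueError(
            f"more than {MAX_REPLICATIONS} replications are not supported"
        )
    # The manager appliance always brings one replication server along.
    total = max(1, math.ceil(replicated_vms / REPLICATIONS_PER_SERVER))
    return total, total - 1


if __name__ == "__main__":
    print(servers_needed(150))   # (1, 0)
    print(servers_needed(450))   # (3, 2)
    print(servers_needed(2000))  # (10, 9)
```

This only covers the hard limits; in practice you may want additional servers earlier to balance the load.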

Sample Architecture

The following picture should illustrate the components and traffic flow during the vSphere replication process.

vsphere_replication_overview

 

The following diagrams show two sample architectures regarding the network.

In the first diagram the vSphere Replication traffic MUST be routed (layer 3), while in the second example we are able to stay within a single network segment with our replication traffic (thanks to the new VMkernel functionalities for NFC/replication traffic in vSphere 6).

Option 1: External routing/switch mandatory (would be a good use case for the internal routing of NSX ;-)):

vsphere_replication_design1

Option 2: No routing required; switching occurs within the ESXi host

vsphere_replication_design2

Of course those are only two simple sample configurations, but I want to make you aware that the (virtual) network design ultimately has an impact on the replication performance.
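
To tell which of the two options a given setup falls into, it is enough to check whether the replication-related addresses share an IP subnet. Here is a minimal Python sketch using the standard `ipaddress` module; the function name and all addresses are hypothetical examples, not values from the diagrams.

```python
import ipaddress


def needs_routing(source_if, target_if):
    """Return True if traffic between the two interfaces must be routed.

    Both arguments are strings in "address/prefix" notation, for example
    the replication VMkernel port of an ESXi host and the address of the
    vSphere Replication appliance.
    """
    src = ipaddress.ip_interface(source_if)
    dst = ipaddress.ip_interface(target_if)
    # Same segment only if each side sees the other inside its own subnet.
    same_segment = dst.ip in src.network and src.ip in dst.network
    return not same_segment


if __name__ == "__main__":
    # Option 2 style: everything in one subnet, no routing.
    print(needs_routing("192.168.10.11/24", "192.168.10.21/24"))  # False
    # Option 1 style: different subnets, an external router is required.
    print(needs_routing("192.168.10.11/24", "192.168.20.21/24"))  # True
```

Keep in mind that this is only a layer 3 check: two addresses in the same subnet still need to sit in the same VLAN/port group to reach each other without a router.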

I will focus on the performance difference between those two options (and the usage of the compression mode within a LAN) in part 2. Stay tuned 😉