Watch VMware vSphere HA Recover Virtual Machines Across AWS Availability Zones

This post demonstrates a simulated failure of an Availability Zone (AZ) in a VMware Cloud on AWS stretched cluster. The environment consists of a six-host stretched cluster in the eu-west-2 (London) region, spread across Availability Zones eu-west-2a and eu-west-2b.

The simulation was carried out by the VMware Cloud on AWS back-end support team to help gather evidence of AZ resilience. Failover relies on vSphere High Availability (HA). Traditionally, in the event of a host failure, HA restarts the affected virtual machines on available hosts in the same cluster. In this scenario, when the 3 hosts in AZ eu-west-2a are lost, vSphere HA automatically restarts the virtual machines on the remaining 3 hosts in AZ eu-west-2b. High Availability across Availability Zones is made possible by stretched networks (NSX-T) and storage replication (vSAN).
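As a rough illustration of the behaviour described above, the following Python sketch models a six-host stretched cluster and restarts the virtual machines from a failed AZ on the surviving hosts. The host names, VM names, and the "least-loaded host" placement rule are assumptions made for illustration only; this is not how vSphere HA actually selects hosts.

```python
from collections import defaultdict

def fail_over(placement, host_az, failed_az):
    """Return a new VM -> host placement after every host in failed_az is lost.

    placement: dict of VM name -> host name
    host_az:   dict of host name -> Availability Zone
    """
    survivors = sorted(h for h, az in host_az.items() if az != failed_az)
    if not survivors:
        raise RuntimeError("no surviving hosts to restart virtual machines on")

    # Count the VMs already running on each surviving host.
    load = defaultdict(int)
    for vm, host in placement.items():
        if host_az[host] != failed_az:
            load[host] += 1

    new_placement = {}
    for vm, host in sorted(placement.items()):
        if host_az[host] != failed_az:
            new_placement[vm] = host  # unaffected, keeps running where it is
        else:
            # Illustrative rule: restart on the least-loaded surviving host.
            target = min(survivors, key=lambda h: (load[h], h))
            load[target] += 1
            new_placement[vm] = target
    return new_placement

# Hypothetical inventory: 3 hosts per AZ, as in the environment described above.
host_az = {
    "esx-1": "eu-west-2a", "esx-2": "eu-west-2a", "esx-3": "eu-west-2a",
    "esx-4": "eu-west-2b", "esx-5": "eu-west-2b", "esx-6": "eu-west-2b",
}
placement = {"vcenter": "esx-1", "nsx-manager": "esx-2", "web-01": "esx-3", "web-02": "esx-4"}

after = fail_over(placement, host_az, "eu-west-2a")
for vm, host in sorted(after.items()):
    print(vm, "->", host, host_az[host])
```

After the simulated loss of eu-west-2a, every virtual machine in the model ends up on a host in eu-west-2b, which mirrors what the screenshots later in this post show for the vCenter Server and NSX Manager appliances.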

AWS Terminology: Each Region is a separate geographic area. Each Region has multiple, isolated locations known as Availability Zones. Each Region is completely independent. Each Availability Zone is isolated, but the Availability Zones in a Region are connected through low-latency links. An Availability Zone can be a single data centre or data centre campus.

VMC_Environment

You may also want to review further reading: How to Deploy and Configure VMware Cloud on AWS (Part 1), How to Migrate VMware Virtual Machines to VMware Cloud on AWS (Part 2), plus additional demo post Watch a Failover from Direct Connect to Backup VPN for VMware Cloud on AWS. For more information on Stretched Clusters for VMware Cloud on AWS see Overview and Documentation, as well as the following:

VMware FAQ | AWS FAQ | Roadmap | Product Documentation | Technical Overview | VMware Product Page | AWS Product Page | Try first @ VMware Cloud on AWS – Getting Started Hands-on Lab

Availability Zone (AZ) Outage

Before beginning, it is worth reiterating that the following screenshots do not represent a process. The customer / consumer of the service does not need to intervene unless a specific DR strategy has been put in place. In the event of a real-world outage, everything highlighted below happens automatically and is managed and monitored by VMware. You will of course want to be aware of what is happening on the platform hosting your virtual machines, which is why this post gives you a feel for what to expect. It may seem a little underwhelming, as it looks just like a normal vSphere HA failover.

When we start out in this particular environment the vCenter Server and NSX Manager appliances are located in AZ eu-west-2a.

vcenter-2a

nsx-2a

The AZ failure simulation was initiated by the VMware back-end team. At this point all virtual machines in Availability Zone eu-west-2a went offline, including the example virtual machines shown in the screenshots above. As expected, within 5 minutes vSphere HA automatically brought the machines online in Availability Zone eu-west-2b. All virtual machines were accessible and working without any further action.

The stretched cluster now shows the hosts in AZ eu-west-2a as unresponsive. The hosts in AZ eu-west-2b are still online and able to run virtual machines.

Host-List

The warning on the hosts located in AZ eu-west-2b is a vSAN warning, raised because cluster nodes are down. This is expected behaviour in the event of host outages.

eu-west-2b

The vCenter Server and NSX Manager appliances are now located in AZ eu-west-2b.

vcenter-2b

nsx-2b

Availability Zone (AZ) Return to Normal

Once the Availability Zone outage has been resolved and the ESXi hosts have booted, they return as connected in the cluster. As normal with a vSphere cluster, Distributed Resource Scheduler (DRS) will then proceed to balance resources accordingly.

Host-List-Normal

The vSAN object resync takes place and the health checks all change to green. Again this is something that happens automatically, and is managed and monitored by VMware.

vSAN-1

vSAN-2

Using a third-party monitoring tool we can see the brief outage during virtual machine failover, and the server down / return to normal email alerts generated for the support team.

Monitoring
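The monitoring tool used here is not named, so the following Python sketch only illustrates how an outage window like the one above could be derived from periodic reachability probes. The timestamps, the 60-second probe interval, and the probe results are made-up sample data.

```python
from datetime import datetime, timedelta

def outage_windows(samples):
    """Collapse (timestamp, is_up) probe samples into (start, end) outage windows.

    An outage starts at the first failed probe and ends at the next successful one.
    end is None if the target is still down at the last sample.
    """
    windows = []
    down_since = None
    for ts, is_up in sorted(samples):
        if not is_up and down_since is None:
            down_since = ts
        elif is_up and down_since is not None:
            windows.append((down_since, ts))
            down_since = None
    if down_since is not None:
        windows.append((down_since, None))
    return windows

# Hypothetical probe history: one probe every 60 seconds.
start = datetime(2020, 1, 1, 10, 0, 0)
results = [True, True, False, False, False, True]
samples = [(start + timedelta(seconds=60 * i), ok) for i, ok in enumerate(results)]

windows = outage_windows(samples)
for begin, end in windows:
    duration = end - begin if end is not None else None
    print("down from", begin, "to", end, "duration", duration)
```

The measured duration is only as precise as the probe interval, so a real tool polling every minute could report a failover as anything up to a minute longer than it actually was.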

This ties in with the vSphere HA events recorded for the ESXi hosts and virtual machines, which we can view as normal in vCenter.

VM-Logs


How to Configure AWS Direct Connect with VMware Cloud on AWS

This post covers the setup of AWS Direct Connect with VMware Cloud (VMC) on AWS. Direct Connect provides a high-speed, low-latency connection between Amazon services and your on-premises environment. Direct Connect is useful for those who want dedicated private connectivity with a more consistent network experience than internet-based VPN connections.

“Direct Connect traffic travels over one or more virtual interfaces that you create in your customer AWS account. For SDDCs in which networking is supplied by NSX-T, all Direct Connect traffic, including vMotion, management traffic, and compute gateway traffic, uses a private virtual interface. This establishes a private connection between your on-premises data center and a single Amazon VPC.

You can create multiple interfaces to allow for redundancy and greater availability.”

Using AWS Direct Connect with VMware Cloud on AWS

Make sure you understand the terminology around a Virtual Interface (VIF) and the difference between a Standard VIF, Hosted VIF, and Hosted Connection: What’s the difference between a hosted virtual interface (VIF) and a hosted connection? It is important to consider that VMware Cloud on AWS requires a dedicated Virtual Interface (VIF) – or a pair of VIFs for resilience. If you have a standard 1Gbps or 10Gbps connection direct from Amazon then you can create and allocate VIFs for this purpose. If you are using a hosted connection from an Amazon Partner Network (APN) for sub-1G connectivity then you may need to procure additional VIFs, or a dedicated Direct Connect with the ability to have multiple VIFs on a single circuit. This is a discussion you should have with your APN partner.

First, review the prerequisites and the steps to request an AWS Direct Connect connection at Getting Started with AWS Direct Connect. The steps below walk through configuring Direct Connect for use with VMware Cloud on AWS once the initial connection with Amazon or an Amazon partner has been set up. Also review Direct Connect Pricing.

Direct Connect VMC Setup

Log into the VMware Cloud on AWS Console. From the SDDCs tab, locate the appropriate SDDC and click View Details. Select the Networking & Security tab. Under System click Direct Connect. Make a note of the AWS Account ID. This is the shadow AWS account set up for VMC, and you will need this account ID to associate with the Direct Connect.

VMC_DX_1

Log into the AWS console and navigate to the Direct Connect service. If you have not already accepted the connection from your third party provider then review the Amazon documentation referenced above.

AWS_DX_1

Select Virtual Interfaces and click Create Virtual Interface. In this instance we are creating a private VIF. Select the physical connection to use and give the virtual interface a name. Change the virtual interface owner to Another AWS Account and enter the VMC shadow AWS account ID. Fill in the VLAN and BGP ASN information provided by your connection provider. Repeat the process if you are assigning more than one VIF.

AWS_DX_2
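The console steps above can also be scripted. The sketch below assumes the boto3 `directconnect` client and its `allocate_private_virtual_interface` call, which creates a private VIF owned by another AWS account. The connection ID, account ID, VIF name, VLAN, ASN, and region are placeholders; use the values from your own connection provider and the VMC console. The request is only built and printed here; `allocate_vif` is defined but never called, because it needs boto3 and valid AWS credentials.

```python
def build_vif_allocation(connection_id, owner_account, name, vlan, asn, auth_key=None):
    """Build keyword arguments for DirectConnect.Client.allocate_private_virtual_interface."""
    allocation = {"virtualInterfaceName": name, "vlan": vlan, "asn": asn}
    if auth_key:
        allocation["authKey"] = auth_key  # BGP authentication key, if your provider supplied one
    return {
        "connectionId": connection_id,
        "ownerAccount": owner_account,  # the VMC shadow AWS account ID
        "newPrivateVirtualInterfaceAllocation": allocation,
    }

def allocate_vif(params, region):
    """Send the request to AWS. Requires boto3 and credentials; not called in this sketch."""
    import boto3
    client = boto3.client("directconnect", region_name=region)
    return client.allocate_private_virtual_interface(**params)

# Placeholder values for illustration only.
params = build_vif_allocation(
    connection_id="dxcon-EXAMPLE",
    owner_account="123456789012",
    name="vmc-vif-1",
    vlan=101,
    asn=65000,
)
print(params)
```

Repeating the call with a second name and VLAN would correspond to assigning more than one VIF.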

Once the VIF or VIFs are created you will see a message that they need to be accepted by the account we have set as owner.

AWS_DX_3

Go back to the VMC portal and the Direct Connect page, click Refresh if necessary. Any interfaces associated with the shadow AWS account will now be listed as available.

VMC_DX_2

Attach the virtual interfaces and confirm acknowledgement that you will be responsible for any data transfer charges that are incurred.

VMC_DX_3

At this point it will take up to 10 minutes for the state of each interface to change from Attaching to Attached, and the BGP status to change from Down to Up. You should now see Advertised BGP Routes listing the network segments you have configured, and Learned BGP Routes listing the subnets peering from your on-premises network.
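If you would rather check the AWS side from a script than refresh the console, a minimal sketch follows. It assumes the response shape returned by the boto3 `directconnect` client's `describe_virtual_interfaces` call (`virtualInterfaceState` and `bgpPeers[].bgpStatus`). The sample response is hand-written for illustration, and the AWS-side states are not the same labels as the Attaching / Attached states shown in the VMC console.

```python
def vif_ready(vif):
    """True when a virtual interface is available and at least one BGP peer is up."""
    peers = vif.get("bgpPeers", [])
    return (
        vif.get("virtualInterfaceState") == "available"
        and any(peer.get("bgpStatus") == "up" for peer in peers)
    )

def pending_vifs(response):
    """Return the IDs of virtual interfaces that are not yet ready."""
    return [
        vif["virtualInterfaceId"]
        for vif in response.get("virtualInterfaces", [])
        if not vif_ready(vif)
    ]

# Hand-written sample in the shape of a describe_virtual_interfaces response.
sample_response = {
    "virtualInterfaces": [
        {
            "virtualInterfaceId": "dxvif-EXAMPLE1",
            "virtualInterfaceState": "available",
            "bgpPeers": [{"bgpStatus": "up"}],
        },
        {
            "virtualInterfaceId": "dxvif-EXAMPLE2",
            "virtualInterfaceState": "pending",
            "bgpPeers": [{"bgpStatus": "down"}],
        },
    ]
}

print("still waiting on:", pending_vifs(sample_response))
```

In practice you would call this in a loop with a sleep between polls and stop once the pending list is empty.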

Click Overview. The Direct Connect shows green, and the corresponding VIFs in the AWS Direct Connect page show green and available.

Direct_Connect_Up_VMC

For Direct Connect deep dives review the following blog posts by Nico Vibert: AWS Direct Connect – Deep Dive and Integration with VMware Cloud on AWS, and Direct Connect with VMware Cloud on AWS with VPN as a back-up.

Further Reading: How to Deploy and Configure VMware Cloud on AWS (Part 1), How to Migrate VMware Virtual Machines to VMware Cloud on AWS (Part 2).

Load Balancing VMware Cloud on AWS with Amazon ELB

This post demonstrates the connectivity between VMware Cloud (VMC) on AWS and native AWS services. In the example below we will be using Amazon Elastic Load Balancing (ELB) to provide highly available, scalable, and secure load balancing backed by virtual machines hosted in the VMware Cloud Software-Defined Data Centre (SDDC). This post assumes you have a basic understanding of both platforms. Further Reading: How to Deploy and Configure VMware Cloud on AWS (Part 1), How to Migrate VMware Virtual Machines to VMware Cloud on AWS (Part 2).

When integrating with Amazon ELB there are two options: Application Load Balancer (ALB), which operates at the request layer (7), or Network Load Balancer (NLB), which operates at the connection layer (4). The Amazon Classic Load Balancer is for Amazon EC2 instances only. For assistance with choosing the correct type of load balancer review Details for Elastic Load Balancing Products and Product Comparisons. Amazon load balancers and their targets can be monitored using Amazon CloudWatch.

Connectivity Overview

  • Update Feb 2020 – full details can be found at AWS Native Services Integration With VMware Cloud on AWS
  • VMware Cloud on AWS links with your existing AWS account to provide access to native services. During provisioning a CloudFormation template will grant AWS permissions using the Identity and Access Management (IAM) service. This allows your VMC account to create and manage Elastic Network Interfaces (ENI) as well as auto-populate Virtual Private Cloud (VPC) route tables.
  • An Elastic Network Interface (ENI) dedicated to each physical host connects the VMware Cloud to the corresponding Availability Zone in the native AWS VPC. There is no charge for data crossing the 25 Gbps ENI between the VMC VPC and the native AWS VPC; however, it is worth remembering that data crossing Availability Zones is charged at $0.01 per GB (at the time of writing).
  • An example architecture below shows a stretched cluster in VMware Cloud on AWS with web services running on virtual machines across multiple Availability Zones. The load balancer sits in the customer's native AWS VPC and connects to the web servers using the ENI connectivity. Amazon’s DNS service Route 53 routes users accessing a custom domain to the web service.
  • Remember to consider the placement of your target servers when deploying the Amazon load balancer. For more information see Planning Your VMware Cloud on AWS Deployment. See also Elastic Load Balancing Pricing.

VMC_LoadBalancing
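To put the cross-AZ charge mentioned above into numbers, here is a small worked example in Python. The $0.01 per GB rate is the figure quoted in this post at the time of writing, and the 500 GB volume is a made-up example. Check the current AWS pricing pages, including how traffic in each direction is billed, before relying on it.

```python
from decimal import Decimal

# USD per GB, as quoted in this post at the time of writing (may have changed).
CROSS_AZ_RATE_PER_GB = Decimal("0.01")

def cross_az_cost(gb_transferred, rate=CROSS_AZ_RATE_PER_GB):
    """Estimate the charge for data crossing Availability Zones."""
    return Decimal(str(gb_transferred)) * rate

# Hypothetical month: 500 GB sent from a load balancer in one AZ
# to web servers running in the other AZ of the stretched cluster.
monthly_gb = 500
print("estimated cross-AZ charge (USD):", cross_az_cost(monthly_gb))
```

This is one reason the placement of target servers relative to the load balancer is worth planning up front.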

VMC Gateway Firewall

Before configuring the ELB we need to make sure it can access the target servers. Log into the VMware Cloud on AWS Console. From the SDDCs tab, locate the appropriate SDDC and click View Details. Select the Networking & Security tab, then under Security click Gateway Firewall and Compute Gateway.

VMC_ELB_FW

In this example I have added a rule for inbound access to my web servers. The source is AWS Connected VPC Prefixes (this can be tied down to only allow access from the load balancer if required). The destination is a user defined group which contains the private IPv4 addresses for the web servers in VMC, and the allowed service is set to HTTP (TCP 80).
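The logic of that rule can be modelled in a few lines of Python, which may help when reasoning about what will and will not be allowed through. This is only a model of the rule's intent, not the NSX-T API, and the CIDR and IP addresses are hypothetical stand-ins for your connected VPC prefix and web servers.

```python
import ipaddress

# Hypothetical values; substitute your own connected VPC CIDR and web server addresses.
rule = {
    "sources": ["172.31.0.0/16"],                        # stands in for AWS Connected VPC Prefixes
    "destinations": {"192.168.10.11", "192.168.10.12"},  # user-defined group of web servers
    "ports": {80},                                       # HTTP (TCP 80)
}

def rule_allows(rule, src_ip, dst_ip, dst_port):
    """Return True if the inbound connection matches the allow rule."""
    source = ipaddress.ip_address(src_ip)
    in_source = any(source in ipaddress.ip_network(cidr) for cidr in rule["sources"])
    return in_source and dst_ip in rule["destinations"] and dst_port in rule["ports"]

print(rule_allows(rule, "172.31.5.20", "192.168.10.11", 80))   # load balancer subnet to web server
print(rule_allows(rule, "172.31.5.20", "192.168.10.11", 443))  # port not in the rule
print(rule_allows(rule, "10.0.0.5", "192.168.10.11", 80))      # source outside the connected VPC
```

Tightening the rule to the load balancer only would amount to replacing the source CIDR with the load balancer's own addresses.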

If you are using the Application Load Balancer then you also need to consider the security group attached to the ALB. If the default group is not used, or the security group attached to the Elastic Network Interfaces has been changed, then you may need to make additional security group changes to allow traffic between the ALB and the ENIs. Review the Security Group Configuration section of Connecting VMware Cloud on AWS to EC2 Instances for more information. The Network Load Balancer does not use security groups. The gateway firewall rule outlined above will be needed regardless of the load balancer type.

ELB Deployment

Log into the VMware Cloud on AWS Console. From the SDDCs tab, locate the appropriate SDDC and click View Details. Select the Networking & Security tab. Under System click Connected VPC. Make a note of the AWS Account ID and the VPC ID. You will need to deploy the load balancer into this account and VPC.

Log into the AWS Console and navigate to the EC2 service. Locate the Load Balancing header in the left-hand navigation pane and click Load Balancers. Click Create Load Balancer. Select the load balancer type and click Create.

VMC_ELB

Typically for HTTP/HTTPS the Application Load Balancer will be used. In this example, since I want to deploy the load balancer to a single Availability Zone for testing, I am using a Network Load Balancer, which can also have a dedicated Elastic (persistent public) IP.

Enter the load balancer configuration. I am configuring an internet-facing load balancer with listeners on port 80 for HTTP traffic. Scroll down and specify the VPC and Availability Zones to use. Ensure you use the VPC connected to your VMware Cloud on AWS SDDC. In this example I have selected a subnet in the same Availability Zone as my VMware Cloud SDDC.

VMC_NLB_1

In the routing section configure the target group which will contain the servers behind the load balancer. The target type needs to be IP.

VMC_NLB_2

In this instance, since I am creating a new target group, I need to specify the IP addresses of the web servers, which are VMs sitting in my VMC SDDC. The Network column needs to be set to Other private IP address.

VMC_NLB_3
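For reference, the same registration can be expressed against the API. The sketch below assumes the boto3 `elbv2` client's `register_targets` call, where, as far as I am aware, setting `AvailabilityZone` to `all` is the equivalent of choosing Other private IP address in the console for addresses outside the load balancer's VPC. The IP addresses are hypothetical, and `register` is defined but not called because it needs boto3, credentials, and a real target group ARN.

```python
def build_ip_targets(ip_addresses, port=80):
    """Build the Targets list for ElasticLoadBalancingv2.Client.register_targets."""
    # "all" marks each address as sitting outside the load balancer's VPC.
    return [{"Id": ip, "Port": port, "AvailabilityZone": "all"} for ip in ip_addresses]

def register(target_group_arn, targets, region):
    """Send the request to AWS. Requires boto3 and credentials; not called in this sketch."""
    import boto3
    elbv2 = boto3.client("elbv2", region_name=region)
    return elbv2.register_targets(TargetGroupArn=target_group_arn, Targets=targets)

# Hypothetical private IPv4 addresses of the web server VMs in the SDDC.
web_servers = ["192.168.10.11", "192.168.10.12"]
targets = build_ip_targets(web_servers)
print(targets)
```

The target group itself must have been created with the IP target type for these entries to be accepted.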

Once the load balancer and target group are configured, review the settings and deploy. You can review the basic configuration, listeners, and monitoring by selecting the newly deployed load balancer.

VMC_NLB_4

Click the Description tab to obtain the DNS name of the load balancer. You can add a CNAME to reference the load balancer using Amazon Route 53 or another DNS service.

VMC_NLB_5

VMC_NLB_6
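If the domain is hosted in Route 53, the CNAME can also be created programmatically. This sketch assumes the boto3 `route53` client's `change_resource_record_sets` call. The record name, load balancer DNS name, and TTL are placeholders, and `apply_change` is defined but not called because it needs boto3, credentials, and a real hosted zone ID.

```python
def build_cname_change(record_name, lb_dns_name, ttl=300):
    """Build a ChangeBatch that points record_name at the load balancer's DNS name."""
    return {
        "Comment": "Point custom domain at the load balancer",
        "Changes": [
            {
                "Action": "UPSERT",  # create the record, or update it if it already exists
                "ResourceRecordSet": {
                    "Name": record_name,
                    "Type": "CNAME",
                    "TTL": ttl,
                    "ResourceRecords": [{"Value": lb_dns_name}],
                },
            }
        ],
    }

def apply_change(hosted_zone_id, change_batch):
    """Send the request to AWS. Requires boto3 and credentials; not called in this sketch."""
    import boto3
    route53 = boto3.client("route53")
    return route53.change_resource_record_sets(
        HostedZoneId=hosted_zone_id, ChangeBatch=change_batch
    )

# Placeholder names for illustration only.
change_batch = build_cname_change(
    "www.example.com.", "my-nlb-EXAMPLE.elb.eu-west-2.amazonaws.com"
)
print(change_batch)
```

A CNAME cannot be placed at the zone apex, so this form suits a name such as `www` rather than the bare domain.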

Finally, navigate to Target Groups. Here you can view the health status of your registered targets, and configure health checks, monitoring, and tags.
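The same health information is available from the API. The following sketch assumes the response shape of the boto3 `elbv2` client's `describe_target_health` call and summarises it by state. The sample response is hand-written for illustration, with hypothetical IP addresses.

```python
from collections import Counter

def summarise_health(response):
    """Count registered targets by health state."""
    return Counter(
        desc["TargetHealth"]["State"]
        for desc in response.get("TargetHealthDescriptions", [])
    )

def unhealthy_targets(response):
    """Return the IDs of targets that are not reporting healthy."""
    return [
        desc["Target"]["Id"]
        for desc in response.get("TargetHealthDescriptions", [])
        if desc["TargetHealth"]["State"] != "healthy"
    ]

# Hand-written sample in the shape of a describe_target_health response.
sample_health = {
    "TargetHealthDescriptions": [
        {"Target": {"Id": "192.168.10.11", "Port": 80}, "TargetHealth": {"State": "healthy"}},
        {"Target": {"Id": "192.168.10.12", "Port": 80}, "TargetHealth": {"State": "unhealthy"}},
    ]
}

print(dict(summarise_health(sample_health)))
print("needs attention:", unhealthy_targets(sample_health))
```

If targets stay unhealthy, the gateway firewall rule and security group configuration described earlier are sensible first places to look.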