
VMware Cloud on AWS Security One Stop Shop

Kicking off 2020 with the theme of the year – security.

In order to keep this content accurate, the bulk of it has been quoted from the Cloud Security Alliance (CSA) VMware Cloud on Amazon Web Services (AWS) self assessment. The CSA define best practices for secure cloud computing environments. The full assessment can be found here under VMware Cloud on AWS. I have listed the key points from the CSA applicable to our customer use case below; any direct quotes are in blue text, alongside further information from the following must-read resources:

Introduction

This post is arranged into the following sections:

  • Introduction
  • Roles & Responsibilities
  • Physical Security
  • Security Operations Monitoring
  • Penetration Testing & Audit
  • Data Accessibility
  • Data Encryption

It is important to understand the service in order to secure it appropriately. Review the VMware Cloud on AWS Service Description, where page 3/11 states:

VMware Cloud on AWS (the Service Offering or VMware Cloud) brings VMware’s enterprise class Software-Defined Data Center software to the Amazon Web Services cloud, enabling customers to run any application across vSphere-based private, public, and hybrid cloud environments.

The Service Offering has the following components:

  • Software-Defined Data Center (SDDC) consisting of:
    • VMware vSphere running on elastic bare metal hosts deployed in AWS
    • VMware vCenter Server appliance
    • VMware NSX Data Center to power networking for the Service Offering
    • VMware vSAN aggregating host-based storage into a shared datastore
    • VMware HCX enabling app mobility and infrastructure hybridity
  • Self-service provisioning of SDDCs, on demand, from vmc.vmware.com
  • Maintenance, patching, and upgrades of the SDDC, performed by VMware

The SDDC service offering uses dedicated AWS physical hardware per tenant; each ESXi host you purchase is a dedicated physical AWS bare metal server. Each customer environment is logically and physically separated, and there is no multi-tenancy or nested virtualisation. The customer is in charge of their own workloads as well as ingress/egress and user access.

This post focuses on the security of the VMware Cloud on AWS platform and collates information provided by VMware. The network design is a topic that requires addressing in its own right: connectivity, default route, Internet access, firewall, load balancing, etc. To ensure you secure the network along with user access and workloads, as outlined in the next section, review in full the above documentation, the VMware Cloud on AWS Documentation, and Reference Architectures, as well as engaging your VMware customer success or account team.

More detail on planning the SDDC deployment can be found in: How to Deploy and Configure VMware Cloud on AWS, and How to Migrate VMware Virtual Machines to VMware Cloud on AWS. The following additional reading may also be of use: AWS Security by Design, Security, Identity, and Compliance on AWS, Humair’s Blog, VMware Network Virtualisation & VMware Cloud Blog.

Roles & Responsibilities

Although VMware Cloud on AWS utilises Infrastructure as a Service (IaaS) from AWS, the customer consumes the platform as a whole from VMware, and therefore maintains a relationship and support contract with VMware. Support for any native AWS services deployed in the customer's connected AWS Virtual Private Cloud (VPC) remains with AWS as normal. The security model is therefore shared between the customer and VMware. VMware separate the roles as follows:

We (VMware) will use commercially reasonable efforts to provide:

  • Information Security: We will protect the information systems used to deliver the Service Offering over which we (as between VMware and you) have sole administrative level control.
  • Security Monitoring: We will monitor for security events involving the underlying infrastructure servers, storage, networks, and information systems used in the delivery of the Service Offering over which we (as between VMware and you) have sole administrative level control. This responsibility stops at any point where you have some control, permission, or access to modify an aspect of the Service Offering.
  • Patching and Vulnerability Management: We will maintain the systems we use to deliver the Service Offering, including the application of patches we deem critical for the target systems. We will perform routine vulnerability scans to surface critical risk areas for the systems we use to deliver the Service Offering. Critical vulnerabilities will be addressed in a timely manner.

You (the customer) are responsible for addressing the following:

  • Information Security: You are responsible for ensuring adequate protection of the Content that you deploy and/or access with the Service Offering. This includes, but is not limited to, any level of virtual machine patching, security fixes, data encryption, access controls, roles and permissions granted to your internal, external, or third party users, etc.
  • Network Security: You are responsible for the security of the networks over which you have administrative level control. This includes, but is not limited to, maintaining effective firewall rules in all SDDCs that you deploy in the Service Offering.
  • Security Monitoring: You are responsible for the detection, classification, and remediation of all security events that are isolated with your deployed SDDCs, associated with virtual machines, operating systems, applications, data, or content surfaced through vulnerability scanning tools, or required for a compliance or certification program in which you are required to participate, and which are not serviced under another VMware security program.

In a nutshell, as well as user access and connectivity, the customer is ultimately responsible for what is inside the Virtual Machine. This means things like Anti-Virus, operating system and application patches, monitoring, backups, access control / privileged access, etc. The VMware Cloud Service Offerings Terms of Service backs this up, stating:

2.2 You (the customer) are responsible for taking and maintaining appropriate steps to protect the confidentiality, integrity, and security of Your Content. Those steps include (a) controlling access you provide to your Users, (b) configuring the Service Offering appropriately, (c) ensuring the security of Your Content while it is in transit to and from the Service Offering, (d) using encryption technology to protect Your Content, and (e) backing up Your Content.

It is the customer's responsibility to secure data appropriately through accessibility and authorisation. This means securing connectivity with a Virtual Private Network (VPN) or Direct Connect and maintaining associated on-premise firewalls accordingly, as well as implementing secure policies and firewall rules for the VMware Cloud Internet Gateway, VMware Cloud Compute and Management Gateways (NSX Edge Firewalls), and the NSX Distributed Firewall (Micro-Segmentation).

The NSX Edge Firewalls protect north-south traffic, essentially anything coming in or out of the SDDC, while the Distributed Firewall protects east-west traffic between workloads inside the SDDC. Micro-Segmentation can be used to protect applications by ring-fencing virtual machines in a zero trust architecture. The VMware Cloud on AWS NSX Networking and Security eBook goes into great detail on the NSX Edge and Distributed Firewalls with screenshots and configuration examples in chapter 6 (page 83). Both firewall types are included for all virtual machines in the default VMware Cloud on AWS pricing model.
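To make the north-south versus east-west distinction concrete, here is a minimal Python sketch that classifies a flow by whether its endpoints sit inside the SDDC. The segment CIDRs and function names are hypothetical, chosen purely for illustration; this is not an NSX API.

```python
from ipaddress import ip_address, ip_network

# Hypothetical SDDC compute segments; substitute your own CIDRs.
SDDC_SEGMENTS = [ip_network("192.168.10.0/24"), ip_network("192.168.20.0/24")]


def in_sddc(addr: str) -> bool:
    """Return True if the address belongs to one of the SDDC segments."""
    ip = ip_address(addr)
    return any(ip in net for net in SDDC_SEGMENTS)


def traffic_direction(src: str, dst: str) -> str:
    """Classify a flow by which of its endpoints are inside the SDDC."""
    if in_sddc(src) and in_sddc(dst):
        return "east-west"    # workload to workload: Distributed Firewall territory
    if in_sddc(src) or in_sddc(dst):
        return "north-south"  # enters or leaves the SDDC: NSX Edge Firewall territory
    return "external"         # neither endpoint is in the SDDC


print(traffic_direction("192.168.10.5", "192.168.20.7"))
print(traffic_direction("10.0.0.4", "192.168.10.5"))
```

A classification like this can help when sorting an existing on-premise rule base into gateway rules and Distributed Firewall rules before migration.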

All native AWS services deployed in the connected AWS VPC fall under the customer's responsibility to secure as normal with Security Groups, Access Control Lists (ACLs), Identity and Access Management (IAM) groups/roles/policies, etc. This includes the cross-VPC 25Gbps Elastic Network Interfaces (ENI) deployed to connect the SDDC with the customer's VPC.

VMware Cloud on AWS customers retain control and ownership of their Customer Content and it is the customer’s responsibility to manage data retention to their own requirements. VMware Cloud on AWS backs up Account Information including system configuration settings but does not provide backup services for Customer Content.

Source: Cloud Security Alliance (CSA) VMware Cloud on Amazon Web Services (AWS) self assessment

The CSA assessment and VMware Cloud Services Security Overview go into more detail on code security, change control, quality assurance, and configuration management; however, it is worth calling out patching. VMware are responsible for patching the underlying infrastructure of the platform; this includes all network, utility and security equipment. Critical security patches are ‘installed in a timely manner’, while non-critical patches are included in predefined patch schedules.

Customers have visibility of when VMware SDDC products are updated from the Cloud Services Portal. In most cases updates and patches can be applied before General Availability (GA); some products run VMware Cloud specific versions and do not need to wait for the next GA release of vSphere, for example. In addition, VMware has subscriptions to internal vendor security and bug-tracking notification services, meaning remediation efforts are accelerated and critical or high-risk issues are prioritised, with fixes often applied before the vulnerability has been made public.

For more information on AWS Security by Design and the shared security model of security of the cloud and security in the cloud, start with the Introduction to AWS Security by Design Whitepaper.

Physical Security

VMware Cloud on AWS uses Amazon Web Services (AWS) geographically resilient data center hosting facilities.  Data centers are built in clusters in various global regions. VMware provides customers the flexibility to place VMware Cloud on AWS instances and store data within multiple geographic regions as well as across multiple Availability Zones within each region to minimize risk.

Physical Access is strictly controlled both at the perimeter and at building ingress/egress points and includes, but is not limited to fencing, walls, video surveillance, intrusion detection systems and other electronic biometric access controls and alarm monitoring systems managed by a 24x7x365 professional security staff.

AWS equipment is protected from outages in alignment with ISO 27001 standard. AWS has been validated and certified by an independent auditor to confirm alignment with ISO 27001 certification standard.  AWS Availability Zones are all redundantly connected to multiple tier-1 transit providers.

Customers explicitly choose which VMware Cloud on AWS data center best suits their needs, and customer data will not traverse locations without the explicit actions of the tenant administrator.

Automated processes are in place that handle media sanitization before repurposing of any hardware. Upon the explicit deletion of a production environment by a tenant, a cryptographic wipe of the hard drive is performed via destruction of keys used by the self-encrypting drives.

When a physical storage device has reached the end of its useful life, a decommissioning process that is designed to prevent customer data from being exposed to unauthorized individuals is followed using techniques detailed in NIST 800-88 (“Guidelines for Media Sanitization”) as part of the decommissioning process – the same applies when exiting the VMware Cloud service.

Source: Cloud Security Alliance (CSA) VMware Cloud on Amazon Web Services (AWS) self assessment

The exact locations of AWS data centres are generally kept secret and AWS do not run data centre tours (this digital tour is about as good as it gets). AWS provide regions which contain multiple Availability Zones, consisting of one or more data centres all physically separated from one another. Electrical power systems, water, telecommunications, and internet connectivity are all designed to be fault tolerant. Availability Zones are connected using private fibre-optic networking, allowing customers to architect highly available solutions.

Physical access to data centres is restricted to those with valid and approved business justification. Site and server room access is limited to authorised individuals and requires point in time access with multi-factor authentication. AWS implement additional perimeter security features outlined above and monitoring for things like open doors and removal of assets.

Media storage devices used to store customer data are classified by AWS as critical and treated as high-impact throughout their life-cycle. Media is decommissioned using techniques detailed in NIST 800-88 and is not removed from AWS control until it has been securely decommissioned. AWS and its employees are audited by multiple compliance programs; you can download AWS Compliance Reports from the AWS Artifact service in the AWS console.

Security Operations Monitoring

VMware monitors internal platform & systems for privacy breaches and has a breach notification process to notify customers in the event of a privacy breach. If VMware becomes aware of a security incident on VMware Cloud on AWS that leads to the unlawful disclosure or access to personal information provided to VMware as a processor, we will notify customers without undue delay, and will provide information relating to a data breach as reasonably requested by our customers. VMware will use reasonable endeavors to assist customers in mitigating, where possible, the adverse effects of any personal data breach.

VMware Cloud on AWS has the capability to detect attacks that target the virtual infrastructure. VMware Cloud on AWS has several intrusion detection mechanisms in place. VMware log aggregation systems continuously ingest AWS firewall and AWS security services logs, along with CloudTrail, infrastructure, and VPC Flow logs. VMware continuously collects and monitors services operation logs using SIEM technologies. The 24x7x365 VMware Security Operations Center uses the SIEM to correlate information with public and private threat feeds to identify suspicious and unusual activities.

The real-time status of the VMware Cloud on AWS services along with past incidents is publicly available here. Availability reports are available to customers upon request within 45 days after a validated SLA event.

The VMware Security Operations Center (SOC) uses Log capture and SIEM tools, security monitoring technologies and intrusion detection tools in realtime to identify unauthorized access attempts or any behaviors that would indicate abnormal activity.

All changes to the virtual machine configuration are logged and available to the customer which enables detection of tampering and integrity checking.

VMware monitors AWS infrastructure and receives notifications directly from AWS in the event of a provider failure. VMware has developed processes with AWS to ensure that we have defined disaster recovery mechanisms in place in the event that an upstream event occurs. VMware Cloud on AWS has conducted successful DR testing and continues to test annually.

Source: Cloud Security Alliance (CSA) VMware Cloud on Amazon Web Services (AWS) self assessment

See the Data Access section for more information on access logging. For ingesting logs from VMware Cloud on AWS, as well as native AWS, and other sources, customers can use Log Insight Cloud which has both free and chargeable versions.

To address any further concerns customers can also use their own Security Information and Event Management (SIEM) tools, such as Splunk, to continuously monitor the VMware Cloud on AWS environment for any unauthorised activity. Furthermore, tools currently used to scan or secure VMware environments on-premise can mostly be carried across to the VMware Cloud on AWS environment with IPFIX and Port Mirroring. This gives the customer unprecedented visibility under the hood of a cloud environment. The VMware Cloud on AWS NSX Networking and Security eBook goes into more detail on these operational tools in chapter 7 (page 101).

As well as Log Insight Cloud and / or the customer's own SIEM tools, AWS can be used to monitor the connected VPC and services. AWS CloudTrail is a service that logs all API calls associated with your account, while AWS Config provides visibility of assets through an inventory of AWS resources and a history of configuration changes to these resources. You can use AWS Config to define rules that evaluate these configurations for compliance. VMware Cloud resources deployed to your connected VPC, such as IAM configurations for SDDC formation, and the attached ENIs, can be found in AWS Config. Your AWS logs can also be added as a content pack or log source for Log Insight Cloud.

In the example screenshots below you can see part of the AWS CloudTrail logs for initial SDDC deployment. Highlighted in green is my user account linking the AWS account and running the CloudFormation template to create the appropriate IAM configuration, then a few minutes later in yellow the events for the ENIs being added and configured. You can view this in more detail in your own environment. The second screenshot shows AWS Config verifying there have been no changes made since initial deployment. I have had to remove most of the detail but you get the idea.

[Screenshots: CloudTrail, AWS Config]
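If you export CloudTrail log files rather than browsing the console, a short script can pull out the same kind of events. The sketch below is a minimal illustration using only the Python standard library. The sample records are made up, and the set of watched event names is my assumption about which API calls are interesting here, not a list published by VMware or AWS.

```python
import json

# Made-up sample shaped like a CloudTrail log file (a JSON object with "Records").
SAMPLE_LOG = json.dumps({
    "Records": [
        {"eventTime": "2020-01-06T10:09:40Z", "eventSource": "ec2.amazonaws.com",
         "eventName": "CreateNetworkInterface",
         "userIdentity": {"arn": "arn:aws:sts::111122223333:assumed-role/example-role/session"}},
        {"eventTime": "2020-01-06T10:02:11Z", "eventSource": "cloudformation.amazonaws.com",
         "eventName": "CreateStack",
         "userIdentity": {"arn": "arn:aws:iam::111122223333:user/example-admin"}},
        {"eventTime": "2020-01-06T10:15:02Z", "eventSource": "s3.amazonaws.com",
         "eventName": "ListBuckets",
         "userIdentity": {"arn": "arn:aws:iam::111122223333:user/example-admin"}},
    ]
})

# Assumed watch list: stack creation plus ENI lifecycle calls.
WATCHED_EVENTS = {
    "CreateStack",
    "CreateNetworkInterface",
    "AttachNetworkInterface",
    "DeleteNetworkInterface",
}


def watched_events(log_text, names=WATCHED_EVENTS):
    """Return the records whose eventName is in the watch list, oldest first."""
    records = json.loads(log_text).get("Records", [])
    hits = [r for r in records if r.get("eventName") in names]
    return sorted(hits, key=lambda r: r["eventTime"])


for record in watched_events(SAMPLE_LOG):
    print(record["eventTime"], record["eventName"], record["userIdentity"]["arn"])
```

The same filter could be pointed at real log files delivered to your S3 bucket, or replaced by an equivalent query in your SIEM.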

For a full list of AWS services that can be used to secure your native workloads and resources see the AWS Cloud Security page. The VMware Cloud on AWS NSX Networking and Security eBook also contains information on leveraging native AWS services in chapter 9 (from page 134).

It is important to note at this point that AWS security tools can only be used in the accounts you have access to. When VMware Cloud on AWS is deployed (i.e. the Elastic Compute Cloud (EC2) bare metal instances with associated VPC, subnets, routing table, etc.), the customer does not have access to the underlying account and VPC. This is where the VMware logs outlined above are used to monitor the environment.

Penetration Testing & Audit

VMware has a comprehensive vulnerability management program. As a part of the vulnerability management program, penetration tests are performed at least annually.

Penetration test results are not provided externally.  VMware Cloud on AWS is subject to regular internal and external reviews and security assessments.  As a part of the VMware Cloud on AWS vulnerability management program, results are reviewed by the VMware security team(s) and remediation is performed based on the security team’s guidance.

With prior approval, Tenants are permitted to perform vulnerability assessments against their allocated service objects. Tenants are not permitted to perform vulnerability assessments against shared VMware assets.

VMware engages independent third-party auditors to perform reviews against industry standards. VMware will furnish audit reports under NDA.

Internal audits are performed at least annually under the VMware information security management system (ISMS) program. VMware utilizes internal/external audits as a way to measure the effectiveness of the controls applied to reduce risks associated with safeguarding information and to identify areas of improvement. Audits are essential to the VMware continuous improvement programs.

External audit reports will be provided to customers under NDA. Internal audit reports are classified as VMware confidential information and are not provided to tenants. Internal audit reports are reviewed by independent third-party auditors as a part of the VMware compliance program.

Risk assessments are performed at least annually, and results are disseminated to management.  Adjustments are made to policies, procedures, standards and controls where necessary to address risks and corrective action plans.

Source: Cloud Security Alliance (CSA) VMware Cloud on Amazon Web Services (AWS) self assessment

Data Access

VMware Cloud on AWS is built on the VMware Photon OS and VMware ESXi. The VMware Cloud on AWS Operations team disables unnecessary ports, protocols and services to harden the production environment. VMware applies security templates via Group Policy Object and we further harden servers through scripts. External communication in the production environment is restricted to ports 80 and 443 and all traffic passes through a firewall before reaching proxy servers in our DMZ. Managed interfaces are configured to deny-all communications traffic by default and allow network communications traffic by exception. – VMware Cloud on AWS deployment uses AWS CloudFormation; configuration and failure remediation are also scripted and automated to ensure a consistent and secure approach.
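The deny-all by default, allow by exception approach quoted above can be sketched in a few lines of Python. This is only an illustration of the policy logic, not VMware's implementation; the rule class is hypothetical, and pairing ports 80 and 443 with TCP is my assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AllowRule:
    """A single exception to the default deny (hypothetical structure)."""
    protocol: str
    port: int


# Ports 80 and 443 come from the quoted text; TCP is assumed.
ALLOW_RULES = (AllowRule("tcp", 80), AllowRule("tcp", 443))


def is_allowed(protocol: str, port: int, rules=ALLOW_RULES) -> bool:
    """Deny by default: traffic passes only if an explicit allow rule matches."""
    return any(r.protocol == protocol.lower() and r.port == port for r in rules)


print(is_allowed("tcp", 443))
print(is_allowed("tcp", 22))
```

The point of the pattern is that there is no deny list to maintain: anything not named in `ALLOW_RULES` is dropped.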

Customers maintain control of who has access to their VMware Cloud on AWS SDDC environment. VMware Cloud on AWS supports Identity Federation between vSphere and the customer’s identity provider using SAML standards for authentication. – Role Based Access Control (RBAC) is used to assign permissions to both the vCenter Server and the Cloud Services Portal.

VMware Cloud on AWS natively supports Common Access Card Authentication and RSA SecurID Authentication to the vSphere client. Other multi-factor authentication systems can be supported via federation between vSphere and the customer’s Identity Provider.

Access control, separation of duties, and other policies define which individuals are allowed to have access to VMware Cloud on AWS management systems. Access to customer environments where customer data is stored is limited to authorized VMware support engineers who must authenticate via two-factor authentication to an access control system in order to generate user-specific, time-limited credentials. Generation of these temporary credentials must be tied to an existing specific support incident ticket in the system. All activity performed by the support engineers is logged while accessing customer SDDCs. – In general, automated runbooks will address previously encountered issues. Execution of automated runbooks is logged and can be traced back to specific support personnel. However, in the event an issue requires the VMware Cloud on AWS Site Reliability Engineering (SRE) team to access the SDDC, time-limited credentials are generated providing access to a specific SDDC for only 8 hours, and must be linked to a system generated or customer generated support ticket. All activities carried out are visible straight away to the customer via the vCenter logs.
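As a rough illustration of the access model described above, the following Python sketch issues a credential that is scoped to one SDDC, expires after 8 hours, and cannot be created without a ticket reference. All class, function, and identifier names are hypothetical; VMware's actual access control system is not public, so treat this as a sketch of the concept only.

```python
import secrets
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone

ACCESS_WINDOW = timedelta(hours=8)  # the 8-hour limit described above


@dataclass(frozen=True)
class SupportCredential:
    """A user-specific, time-limited credential scoped to a single SDDC."""
    engineer: str
    sddc_id: str
    ticket_id: str
    token: str
    expires_at: datetime

    def is_valid(self, sddc_id: str, now: datetime) -> bool:
        """Valid only for the SDDC it was issued for, and only before expiry."""
        return sddc_id == self.sddc_id and now < self.expires_at


def issue_credential(engineer: str, sddc_id: str, ticket_id: str,
                     now: datetime) -> SupportCredential:
    """Refuse to issue a credential that is not tied to a support ticket."""
    if not ticket_id:
        raise ValueError("a support ticket is required to issue credentials")
    return SupportCredential(engineer, sddc_id, ticket_id,
                             secrets.token_urlsafe(16), now + ACCESS_WINDOW)


start = datetime(2020, 1, 6, 9, 0, tzinfo=timezone.utc)
cred = issue_credential("sre-example", "sddc-a", "TICKET-0001", start)
print(cred.is_valid("sddc-a", start + timedelta(hours=7)))
print(cred.is_valid("sddc-a", start + timedelta(hours=9)))
```

Tying issuance to a ticket means every session has an audit trail before it starts, which complements the vCenter logging of what happens during the session.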

Privileged access is logged and captured in a centralized log server.  VMware continuously collects and monitors services operation logs using SIEM technologies. The 24x7x365 VMware Security Operations Center uses the SIEM to correlate information with public and private threat feeds to identify suspicious and unusual activities.

Restricted, authorized personnel have access to the definitive central log servers for the VMware Cloud on AWS servers. Log aggregation sources and storage are protected and integrity of log data is preserved.  Security logs are stored for at least 1 year.

The Customer’s access logs are replicated to other systems where they can be viewed by customers and other individuals with appropriate approvals.

VMware has also deployed mechanisms to ensure that the log data has been properly copied, transported and securely stored to preserve the information as required to maintain full data integrity. Metadata about the environment including security logs are stored for at least 1 year.

VMware Cloud on AWS platform access controls are implemented via directory services group management.  All individuals who have access to the IT infrastructure and their level of access can be identified by enumerating the members of these dedicated groups.

VMware conducts criminal background checks, as permitted by applicable law, as part of pre-employment screening practices for employees commensurate with the employee’s position and level of access to the service.

In alignment with the ISO 27001 standard, all VMware personnel are required to complete annual security awareness training.  Personnel supporting VMware managed services receive additional role-based security training to perform their job functions in a secure manner.

All VMware personnel are required to sign confidentiality agreements as a part of onboarding. Additionally, upon hire, personnel are required to read and accept the Acceptable Use Policy and the VMware Business Conduct Guidelines.

Source: Cloud Security Alliance (CSA) VMware Cloud on Amazon Web Services (AWS) self assessment

Encryption

Key management policies and procedures are in place to guide personnel on proper encryption key management.  Access to cryptographic keys is restricted to limited operational personnel and all access is logged and monitored.  Cryptographic keys used by self-encrypting drives are managed by AWS.

All keys used in VMware Cloud on AWS are unique per tenant.  Tenant specific keys are programmatically generated by an independent and well-established certificate authority at the time of provisioning and are tied to the unique URLs created for each tenant.

All Customer Content imported to VMware Cloud on AWS is stored on dedicated physical NVMe storage hardware that is self-encrypting using XTS-AES-256. Encryption keys of the self-encrypting drives are generated in the physical SED controller and they never leave the storage device. The vSphere keys (KEK) are encrypted with the AWS KMS key (CMK), which is managed by AWS KMS using FIPS 140-2 validated hardware security modules (HSM). The DEK, which is used for encrypting and decrypting virtual machine files, is encrypted using the local host KEK. The vSphere managed encryption keys can be managed/rotated by the customer at any time using the vSAN API or through the vSphere UI.

By default, all customer data at rest is also encrypted by vSAN XTS AES-256 cipher data-at-rest encryption, with two levels of keys: KEK (as the master key) and DEK (per-disk data key).

VMware personnel manage and secure the encryption certificates used to communicate with the VMware Cloud on AWS console and VMware has key management controls in place.  VMware Cloud on AWS operations have complete visibility into certificate information such as installed, expiring and revoked certificates through a certificate management dashboard.

Customers can provide their own keys for VPN connectivity and VMware Cloud on AWS fully supports the use of in-guest encryption of Customer Content which further enables customers to use additional encryption technologies of their choice as well as the key management products and processes to meet their security requirements. For customers who choose to implement in-guest encryption of their Customer Content, VMware does not manage the keys.

VMware utilizes an industry leading commercial solution to secure, store, and tightly control access to tokens, passwords, certificates, API keys, and other secrets.

Data in-transit (authentication, administrative access, customer information, etc.) is encrypted with standard encryption mechanisms (i.e. SSH, TLS). Encrypted vMotion is available at VMware Cloud on AWS between hosts inside the Cloud SDDC.

VMware provides customers with the ability to create IPSEC and SSL VPN tunnels from their environments which support the most common encryption methods including AES-256.

Source: Cloud Security Alliance (CSA) VMware Cloud on Amazon Web Services (AWS) self assessment

Whenever a host machine is removed from a cluster the data encryption keys used by the self-encrypting drives are destroyed. This cryptographic erasure ensures that there is no customer content on the drives before returning the servers to the pool of available hardware. The use of self-encrypting drives protects customers from an individual with physical access to the data centre being able to physically remove drives and access the contents of the drives.

As a further layer of security the VMware vSAN implementation for VMware Cloud on AWS has encryption enabled by default for all clusters, along with de-duplication and compression. These features are defined when a cluster is provisioned and cannot be disabled. In addition, vSAN also provides customisable storage protection policies to ensure data is tolerant to the failure of one or more physical drives in a cluster.

The vSAN storage array encryption allows customers to rotate encryption keys on demand to meet industry regulations; this can be done via vSphere and the API. All vSphere features such as vMotion, Distributed Resource Scheduler (DRS), and High Availability (HA) are supported with vSAN Encryption without impacting I/O performance.
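The two-level key hierarchy (a KEK wrapping a DEK) is what makes on-demand key rotation and cryptographic erasure cheap. The Python sketch below illustrates the idea only. It uses a toy XOR keystream built from SHA-256 as a stand-in for AES so that it runs with the standard library alone; it is NOT secure and is not how vSAN or the self-encrypting drives are implemented. All names are hypothetical.

```python
import hashlib
import secrets


def toy_cipher(key: bytes, data: bytes) -> bytes:
    """XOR data with a SHA-256 keystream. Toy stand-in for AES, NOT secure.

    Applying the function twice with the same key returns the original data.
    """
    stream = b""
    counter = 0
    while len(stream) < len(data):
        stream += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(a ^ b for a, b in zip(data, stream))


# Key hierarchy: the KEK wraps the DEK, and the DEK encrypts the data.
kek = secrets.token_bytes(32)
dek = secrets.token_bytes(32)
wrapped_dek = toy_cipher(kek, dek)
ciphertext = toy_cipher(dek, b"virtual machine data")


def rotate_kek(old_kek: bytes, wrapped: bytes):
    """Re-wrap the DEK under a fresh KEK; the data itself is not re-encrypted."""
    new_kek = secrets.token_bytes(32)
    unwrapped = toy_cipher(old_kek, wrapped)
    return new_kek, toy_cipher(new_kek, unwrapped)


new_kek, new_wrapped = rotate_kek(kek, wrapped_dek)
recovered_dek = toy_cipher(new_kek, new_wrapped)
plaintext = toy_cipher(recovered_dek, ciphertext)
print(plaintext)
```

Rotating the outer key only touches a few bytes of wrapped key material, which is why it can be done on demand. Conversely, destroying the keys leaves the ciphertext unreadable without touching the data, which is the principle behind the cryptographic erasure described earlier.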

 

Bridging the Gap Between NHS and Public Cloud with VMware Cloud on AWS

Following on from How VMware is Accelerating NHS Cloud Adoption, this post dives into more detail around how the UK National Health Service (NHS) can use VMware Cloud on AWS to bridge the gap between existing investments and Public Cloud.

Part 1: How VMware is Accelerating NHS Cloud Adoption

Part 2: Bridging the Gap Between NHS and Public Cloud with VMware Cloud on AWS

Example NHS VMware Cloud on AWS Use Cases

Modern Applications: The VMware strategy of late has seen a significant shift towards cloud agnostic software and the integration of cloud-native application development. VMware Cloud on AWS makes use of the full VMware Software-Defined Data Centre (SDDC) stack; enhancing security of NHS applications with micro-segmentation, and future-proofing application development with Project Pacific (Understand VMware Tanzu, Pacific, and Kubernetes for VMware Administrators).

Data Centre Expansion or Disaster Recovery: VMware Cloud on AWS can reduce NHS data centre footprint on-premise, by expanding new capacity into VMware Cloud on AWS (Deploy and Configure VMware Cloud on AWS), or through the addition of a Disaster Recovery (DR) site accompanied by VMware Site Recovery Manager (SRM).

Legacy Data Centre Evacuation: VMware Cloud on AWS can replace legacy data centres by facilitating the migration of VMware Virtual Machines (VMs) from end of life hardware to VMware Cloud on AWS (Migrate VMware Virtual Machines to VMware Cloud on AWS). In some cases, dependent on internal finance policies, NHS organisations may be able to capitalise the cost of reserved instances (dedicated physical hosts for 1 or 3 years) in VMware Cloud on AWS using recently introduced IFRS 16 Leases. For more information review the Capitalising Your Cloud Booklet.

Hosting NHS Patient Data: There are a number of security principles which should be implemented to host patient or sensitive data; further information is available on the NHS Digital website. Important detail on the shared security model of Public Cloud, and further NHS, VMware, and AWS specific links, can be found in the How VMware is Accelerating NHS Cloud Adoption article, as well as VMware Cloud on AWS Security One Stop Shop. A summary excerpt is below:

“In January 2018 NHS Digital released guidance for NHS and social care data: off-shoring and the use of public cloud services, along with a toolset for identifying and assessing data risk classification. The NHS and social care data: off-shoring and the use of public cloud services guidance paper published by NHS Digital states; ‘NHS and social care organisations can safely put health and care data, including non-personal data and confidential patient information, into the public cloud’. The NHS and social care providers may use cloud computing services for NHS data, providing it is hosted in the UK, or European Economic Area (EEA), or in the US where covered by Privacy Shield.”

“Each individual data controller organisation is responsible for implementing and reviewing their own processes around data risk classifications, however to assist NHS Digital have provided a consistent health and social care data risk model. For organisations that do not yet have cloud governance in place NHS Digital have also provided guidance on the health and social care cloud risk framework.

Cloud services introduce a shared security model. NHS organisations can be compliant by implementing a cloud risk framework and proportionate controls outlined by NHS Digital; summarised in the health and social care cloud security one page overview. Security considerations for different data classifications are detailed in the health and social care cloud security – good practice guide.”

Moving to Internet First: As well as the Cloud First strategy outlined in the article referenced above, the UK Government also seeks to make public sector applications, systems, and services accessible over the Internet, with the Internet First strategy. VMware Cloud on AWS can utilise existing on-premise Health and Social Care Network (HSCN) connections, but can also offer the ideal opportunity to make services Internet facing. This can be supported with the correct network design, and through making use of native AWS services. There is more information below on how VMware Cloud on AWS complements Internet First, and further reading on the NHS Digital Internet First policy can be found here.

“Health and care services now have an Internet First policy that states new digital services should operate over the internet. Existing services should also be updated to do the same at the earliest opportunity and ideally by March 2021.”

Example Native AWS Service Integrations

In the example architecture below a Stretched Cluster has been deployed across 2 AWS Availability Zones in the London region (eu-west-2), providing VMware Virtual Machine (VM) availability across sites and fault domains. AWS Direct Connect provides a private link from on-premise networks and should be deployed with resilience; a standby encrypted Virtual Private Network (VPN) connection can also be used. To see these features in action review Watch VMware vSphere HA Recover Virtual Machines Across AWS Availability Zones, and Watch a Failover from Direct Connect to Backup VPN for VMware Cloud on AWS. Optional access to the Health and Social Care Network (HSCN) is provided by the existing on-premise HSCN connection.

Example_VMC

Focusing on the VMware Cloud on AWS connectivity into native AWS services from the example architecture we can note the following:

  • Connectivity to native AWS services is provided using Elastic Network Interfaces (ENI), a 25Gbps link into Amazon’s backbone network.
  • Traffic traversing the ENI (ingress and egress) is not chargeable. Any deployed services in AWS are chargeable as normal against the connected AWS account.
  • Using a Virtual Private Cloud (VPC) endpoint NHS organisations can make use of Regional Amazon services such as Simple Storage Service (S3), which offers a tiered approach to object storage and pricing, or Glacier for data archive.
  • Using the Virtual Private Cloud (VPC) router NHS organisations can make use of services such as Elastic Compute Cloud (EC2), or managed databases with Relational Database Service (RDS).

An example scenario could be an on-premise application with a large database which does not have the development resource or funding to refactor for native Public Cloud. It could also be that refactoring this application doesn’t offer any additional business benefit or functionality. In this case the database could be migrated to RDS, and the front end web / application servers could be migrated ‘as is’ to run on VMware Cloud on AWS. Using the 25Gbps ENI would, in most cases, remove any latency concerns between the application and the database.

It is important to remember that it isn’t only the consumption of traditional infrastructure services that is on offer. Opening up existing workloads to native AWS services drives innovation and modernisation of applications. One example is Amazon’s Artificial Intelligence (AI) powered voice assistant Alexa, which now gives health advice using information from the NHS website. In addition to AI and Machine Learning, AWS has a portfolio of data lakes and analytics services, enabling cost effective methods for NHS organisations to collect, store, analyse, and share data.

Example_Native

In the case of Internet First, VMware Cloud on AWS in conjunction with native AWS can help scale and consolidate publicly accessible applications, as documented in VMware Cloud on AWS Reference Architectures. In one such example, the following AWS services are used to facilitate public services hosted in VMware Cloud on AWS:

  • Amazon Route 53 is a highly available and scalable cloud Domain Name System (DNS) web service for name resolution.
  • Elastic Load Balancing automatically distributes incoming application traffic across multiple targets. The Application Load Balancer is best suited for load balancing of HTTP and HTTPS traffic operating at the individual request level (Layer 7).
  • AWS Certificate Manager is a service that lets you easily provision, manage, and deploy public and private SSL/TLS certificates for use with AWS services and your internal connected resources.

Additional optional services for performance and security:

  • Amazon CloudFront is a fast Content Delivery Network (CDN) service that securely delivers data, videos, applications, and APIs to customers with low latency and high transfer speeds.
  • AWS Shield is a managed Distributed Denial of Service (DDoS) protection service that safeguards applications running on AWS.
  • AWS WAF is a Web Application Firewall that helps protect your web applications from common web exploits that could affect application availability or compromise security.
  • AWS CloudTrail is a service that enables governance, compliance, operational auditing, and risk auditing of your AWS account.

VMC_ELB

Further Reading: How to Deploy and Configure VMware Cloud on AWS (Part 1), How to Migrate VMware Virtual Machines to VMware Cloud on AWS (Part 2).

VMware Cloud on AWS FAQs | Resources | Documentation | Factbook | Evaluation Guide | On-Boarding Handbook | Operating Principles

How to Migrate VMware Virtual Machines to VMware Cloud on AWS

This post pulls together the workload migration planning and lessons learned notes made during a real life customer use case of evacuating an on-premise data centre to VMware Cloud (VMC) on AWS (Amazon Web Services). The content is a work in progress and intended as a generic list of considerations and useful links for both VMware and AWS; it is not a comprehensive guide. Cloud, more so than traditional infrastructure, is constantly changing. Features are implemented regularly and transparently so always validate against official documentation. This post was last updated on September 16th 2019.

Part 1: SDDC Deployment

Part 2: Migration Planning & Lessons Learned

See Also: VMware Cloud on AWS Security One Stop Shop

1. Virtual Machine Migrations

The following points should help with the planning of Virtual Machine (VM) workload migrations to VMware Cloud on AWS. An assumption is made that the Software Defined Data Centre (SDDC) is stood up and operational with monitoring, backups, Anti-Virus, etc. in place. Review Part 1: SDDC Deployment for more information. I found the SDDC deployment and getting the environment available was the easy part. Internal processes and complexity of the existing environment are going to determine how quickly you can migrate workloads to the SDDC.

We started by exporting a list of Virtual Machines from each vCenter. For each VM we identified the service it was running and the service owner or business owner. The biggest surprise here was the number of servers deployed by, or for, people who had left the organisation. These servers were still being hosted, maintained, and patched, but were no longer needed. We were able to decommission more workloads than expected due to years of VM sprawl. Whilst VMware Cloud on AWS isn’t directly responsible for this, the project forced us to evaluate each server we hosted. For remaining workloads we put together a migration flow which identified the following criteria:

  • CPU, RAM, storage requirements: identified a baseline to automatically accept and then anything above our baseline would require a manual check.
  • Network dependencies: is there a large amount of data in transit, is IP retention required, is the VLAN stretched using Hybrid Cloud Extension (HCX), load balancer requirements.
  • Data flows: used vRealize Network Insight to identify potential egress costs and additional service dependencies.
  • Additional application or organisation specific considerations: e.g. data classification, tagging / charge-back model, backups, security, monitoring, DNS, authentication, licensing or support.
  • Service Management considerations: is the service platinum/gold/silver/bronze or unclassified, do the platform Service Level Agreements (SLAs) fulfil the existing SLAs in place for each service, is the proposed migration type (i.e. amount of downtime) taking this into consideration. Involving Service Management right from the start was useful as they were able to advise on internal processes for Service Acceptance and Business Continuity.
  • Service Owner considerations: if the technical criteria above are met then the next step is to meet with service owners and get their buy-in for the migration. We migrated internal services we owned first, and then used that as a success story to onboard other services. This process involved meeting with various departments, presenting the solution and the benefits over their existing hosting (in our case DR and performance improvements), and migrating dev or test workloads first to build confidence.
  • Migration passport: one of our Senior Engineers came up with this concept as a one-pager for each service that was migrated. It consisted of migration details (change ID, date, status), migration scope (server names, locations, and notes), firewall rules, vRealize Network Insight (vRNI) outputs, and other information such as associated documentation.

Each environment is different so these are provided as example considerations only. Use resources such as those outlined below to develop your own migration strategy.
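
As a rough illustration of the first steps of the migration flow above, the baseline check could be sketched in Python as follows. This is a minimal sketch, not the tooling we used: the `VM` record, the `BASELINE` thresholds, and the decision labels are all hypothetical and would need to be replaced with values that suit your own environment and vCenter export format.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical baseline: the thresholds used in the real project are not stated in this post.
BASELINE = {"cpu": 8, "ram_gb": 32, "storage_gb": 500}


@dataclass
class VM:
    """One row of a (hypothetical) vCenter export, enriched with ownership details."""
    name: str
    cpu: int
    ram_gb: int
    storage_gb: int
    owner: Optional[str] = None        # None models a VM with no known service owner
    needs_ip_retention: bool = False   # True if the service must keep its IP address


def triage(vm: VM) -> str:
    """Return a coarse migration decision for a single VM."""
    # VMs without an owner are candidates for decommissioning rather than migration.
    if vm.owner is None:
        return "review-for-decommission"
    # Anything above the baseline requires a manual check.
    if (vm.cpu > BASELINE["cpu"]
            or vm.ram_gb > BASELINE["ram_gb"]
            or vm.storage_gb > BASELINE["storage_gb"]):
        return "manual-check"
    # IP retention implies a network discussion (e.g. a stretched VLAN) before migration.
    if vm.needs_ip_retention:
        return "network-review"
    return "auto-accept"
```

For example, `triage(VM("web01", 2, 8, 100, owner="Team A"))` falls under the baseline and is accepted automatically, whereas a VM with no owner is flagged for a decommissioning review. The remaining criteria (data flows, service management, service owner buy-in) are judgement calls that still need people in the loop.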

Workload_Mobility

2. Network Design

  • Research the differences and limitations around the different VMware Cloud on AWS connection types, especially under 1Gbps – Configuring AWS Direct Connect with VMware Cloud on AWS
    • Make sure you understand the terminology around a Virtual Interface (VIF) and the difference between a Standard VIF, Hosted VIF, and Hosted Connection: What’s the difference between a hosted virtual interface (VIF) and a hosted connection? It is important to consider that VMware Cloud on AWS requires a dedicated Virtual Interface (VIF) – or a pair of VIFs for resilience. If you have a standard 1Gbps or 10Gbps connection direct from Amazon then you can create and allocate VIFs for this purpose. If you are using a hosted connection from an Amazon Partner Network (APN) for sub-1G connectivity then you may need to procure additional VIFs, or a dedicated Direct Connect with the ability to have multiple VIFs on a single circuit. This is a discussion you should have with your APN partner.

  • The Virtual Private Cloud (VPC) provided by the shadow AWS account cannot be used as a transit VPC. In other words if you want to connect to private IP addressing of native AWS services you cannot hop via VMware Cloud. In this instance a Transit Gateway can be used.
  • At the time of writing a VPN attachment must be created to connect the SDDC to a Transit Gateway. If Direct Connect is in use then the minimum requirement is 1Gbps.
  • If there is a requirement to connect multiple existing AWS VPCs, or multiple SDDCs, with on-premise networks then definitely check out VMware Cloud on AWS with Transit Gateway Demo.
  • If a backup VPN is in use then you may be able to reduce failover time using Bidirectional Forwarding Detection (BFD), which is automatically enabled by AWS. In our case it was not supported by our third party provider.
  • Use vRealize Network Insight to get an idea of dependencies and data flows that you can use to plan firewall rules and estimate egress or cross-AZ charges. In my experience these charges have been minimal. This depends entirely on your own environment, but should be considered when calculating overall VMware Cloud on AWS pricing.
  • If you want to update your default route see How to Set the Default Route in VMware Cloud on AWS: Part 1 & Part 2.
  • VMware Cloud on AWS: NSX Networking and Security eBook
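
The egress and cross-AZ estimate mentioned in the list above is simple arithmetic once you have monthly data volumes per traffic type, for example from vRealize Network Insight. The Python sketch below assumes you have already summarised those volumes in GB. The traffic type names and the per-GB rates are placeholders I have made up for illustration, not published prices, so always check current AWS pricing.

```python
from typing import Dict

# Placeholder per-GB rates in USD. These are assumptions for illustration only,
# not published AWS prices.
RATES_PER_GB: Dict[str, float] = {
    "internet_egress": 0.09,
    "cross_az": 0.01,
}


def estimate_monthly_charges(flows_gb: Dict[str, float],
                             rates: Dict[str, float] = RATES_PER_GB) -> Dict[str, float]:
    """Multiply monthly GB per traffic type by its rate and add a total."""
    unknown = set(flows_gb) - set(rates)
    if unknown:
        raise ValueError(f"No rate defined for: {sorted(unknown)}")
    costs = {kind: round(gb * rates[kind], 2) for kind, gb in flows_gb.items()}
    costs["total"] = round(sum(costs.values()), 2)
    return costs


if __name__ == "__main__":
    # Hypothetical monthly volumes summarised from flow data.
    print(estimate_monthly_charges({"internet_egress": 500, "cross_az": 1000}))
```

Traffic over the ENI into the connected AWS account is deliberately left out, as it is not chargeable.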

3. Load Balancing & Security

  • Update Jan 2020 – see also VMware Cloud on AWS Security One Stop Shop
  • With the acquisition of Avi Networks we can expect Avi Networks services as a paid add-on for VMware Cloud: VMware Cloud on AWS: NSX and Avi Networks Load Balancing and Security.
  • Third party load balancers such as virtual F5 can be deployed in virtual appliance format. If you are planning on using AWS Elastic Load Balancer (ELB) on a private IP address accessible on-premise ensure you have a connectivity method as outlined above.
  • The NSX Distributed Firewall (DFW) feature is included in the price of VMware Cloud. The paid-for message is removed from SDDC v1.8 onwards, as announced at VMworld 2019.
  • Another VMworld 2019 announcement was the inclusion of syslog forwarding with the free version of VMware Cloud Log Intelligence (SaaS offering for log analytics), although for troubleshooting NSX DFW logs you still need the paid for version.
  • If you are using HCX this product uses its own IPSec tunnel and therefore we could not get it working with the private IP address over a backup VPN. It was assumed that HCX would also not work with the private IP address via Transit Gateway either, due to the SDDC VPN requirement, and would need to be reconfigured to use the public IP address.
  • Another HCX migration consideration is that when you are stretching a network all traffic goes via the HCX Interconnects. This means you are encapsulating everything in UDP port 4500, and essentially bypassing your on-premise firewall rules while the network is stretched. It is important to double check all rules are correct before eventually moving the gateway to VMC.
  • Again if you are doing VMware HCX migrations, remember to remove stretched networks once complete. This involves shutting down the gateway on-premise, removing the L2 stretch, and changing the network in the SDDC to routed. For us the down time was around 30 seconds. The HCX appliances in our environment, although covered by vSphere High Availability (HA), didn’t have resilience built in, so we decided to minimise the amount of time they were in use by planning a migration strategy around each subnet.
  • If you use NSX Service Deployments for Anti-Virus, i.e. Guest Introspection for agentless AV, then you will need to deploy an agent on each VM instead, as this feature is currently unavailable in VMware Cloud on AWS.
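
One way to double check firewall rules before moving the gateway of a stretched network, as suggested in the list above, is to compare the rules enforced on-premise against those defined in the SDDC. The Python sketch below assumes both rule sets have been exported and flattened into simple source, destination, protocol, and port tuples. The `Rule` type and the sample addresses are hypothetical, and real NSX rules use groups and services, so treat an exact-match comparison like this as a starting point only.

```python
from typing import List, NamedTuple, Set


class Rule(NamedTuple):
    """A flattened allow rule (hypothetical export format)."""
    source: str
    destination: str
    protocol: str
    port: int


def missing_rules(on_prem: Set[Rule], sddc: Set[Rule]) -> List[Rule]:
    """Return on-premise rules with no exact match in the SDDC, sorted for review."""
    return sorted(on_prem - sddc)


if __name__ == "__main__":
    on_prem = {
        Rule("10.0.1.0/24", "10.0.2.10", "tcp", 443),
        Rule("10.0.1.0/24", "10.0.2.20", "tcp", 1433),
    }
    sddc = {
        Rule("10.0.1.0/24", "10.0.2.10", "tcp", 443),
    }
    for rule in missing_rules(on_prem, sddc):
        print("Not yet defined in the SDDC:", rule)
```

Any rule reported here would otherwise only come to light after the gateway move, when traffic stops bypassing the firewall via the HCX Interconnects.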

4. General

  • The Cloud Services Portal (CSP) can be integrated with enterprise federation, allowing you to control access using your organisational policies, which should mean Multi-Factor Authentication (MFA) is enforced and access is removed as part of a leavers process. Federation will only work with a tenant, it will not work with a master organisation.
  • It is not possible at the time of writing to easily transfer an SDDC deployed in the root/master organisation into a tenant. The process currently is a redeploy and migrate.
  • Druva offer a product that will back up Virtual Machines from VMware Cloud on AWS direct into an S3 bucket they manage. For a greenfield deployment, if you are not transferring any existing licenses, this could be a good option as you only pay for the capacity you use. Having a backup environment set up in AWS has many benefits but also adds a management overhead and the consideration of replicating between Availability Zones.
  • In general internal support was good once teams were educated on the platform and the slightly different operating model we were implementing. In terms of external support we have not encountered any compatibility issues yet. One application vendor published a KB article stating they support running the application on VMware Cloud on AWS, then backtracked and said they wouldn’t support it because the SDDC was running a vSphere version that was not yet GA (6.8 at the time of writing).