This second post in a new lab series provides a walkthrough for installing the latest iteration of vSAN 7. At the time of writing the latest version of vSAN is vSAN 7.0 Update 1. To read about what’s new see vSphere 7 and vSAN 7 Headline New Features.
VMware vSAN is a software-defined storage solution baked directly into the vSphere hypervisor. vSAN aggregates local or directly-attached devices and pools them together across hosts in a vSphere cluster to provide a single shared storage pool. Functionality is abstracted from the underlying hardware and managed at a software level, within vCenter, to provide granular policy-based availability and controls. Non-disruptive scale out can be achieved by adding more ESXi hosts, either in the same cluster or a new cluster, and scale up by adding more disks to the existing hardware. Multiple vSAN clusters can be created and managed within a single vCenter Server. Since vSAN is already built directly into ESXi, activating the functionality simply requires planning and enabling the configuration, along with the appropriate VMware vSAN licenses.
In this example, vSAN will be configured in a lab environment using a 2-host cluster (Intel NUC Bean Canyon) running vSphere 7 U1C, with a third node acting as the vSAN witness. As of vSAN 7.0 U1, a single witness appliance can support up to 64 2-node clusters. If you’re looking for more information on running a vSphere lab on the Intel NUC range, check out the VMware Homelab section of virten.net, which has some great guides and resources.
vSAN 7.0 Install Guide
vSAN can be configured in an all-flash or hybrid setup. In a hybrid setup, flash is used for the cache with spinning disks providing the capacity tier. Although all local capacity devices are pooled together and shared across hosts in the cluster, an optimal vSAN configuration will contain hosts with the same or similar physical storage configurations, balancing storage devices consistently across the cluster. That said, hosts without any contributing storage can also join the cluster and run virtual machines. In this type of setup, it is particularly important to plan the deployment for fault tolerance and for protection against the loss of specific contributing nodes.
All hosts contributing storage devices to the cluster must include at least one flash device for local cache, alongside at least one capacity device. For hybrid configurations, the flash device must be a minimum of 10% of the anticipated consumed storage of the capacity tier, and this should account for future growth to prevent reduced performance over time as the consumed storage grows. The cache for each host in any setup does not count towards the overall size of the shared datastore. Cache and capacity devices in a host form one or more disk groups, outlined in the high level image below. For more information on capacity and sizing considerations when designing a vSAN deployment, review the VMware vSAN Design Guide and the Designing and Sizing a vSAN Cluster documentation.
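As a rough illustration of the 10% guideline for hybrid configurations, the following Python sketch estimates a minimum cache size. The function name, the `growth_factor` parameter, and the example figures are assumptions made for illustration and are not part of any VMware tooling; the vSAN Design Guide remains the reference for real sizing.

```python
def min_hybrid_cache_gb(consumed_gb, growth_factor=1.0, cache_ratio=0.10):
    """Estimate the minimum flash cache (GB) for a hybrid configuration.

    consumed_gb   -- anticipated consumed storage on the capacity tier, in GB
    growth_factor -- hypothetical multiplier for future growth (1.5 = +50%)
    cache_ratio   -- the 10% guideline described in the text
    """
    if consumed_gb < 0 or growth_factor <= 0:
        raise ValueError("consumed_gb must be >= 0 and growth_factor > 0")
    return consumed_gb * growth_factor * cache_ratio


# Example: 2,000 GB consumed today, planning for 50% growth.
cache_gb = min_hybrid_cache_gb(2000, growth_factor=1.5)
print(f"Minimum cache: {cache_gb:.0f} GB")
```

Sizing against projected rather than current consumption is what keeps the cache ratio from eroding as the datastore fills.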
VMware vSAN is an enterprise solution and supports all VMware features that rely on shared storage, like High Availability, Distributed Resource Scheduler, and Storage vMotion. vSAN also includes features like stretched clustering, and fault domain implementations. Hosts in a vSAN cluster can also mount other VMFS and NFS datastores, although vSAN itself does not require or rely on any kind of external storage or Storage Area Network (SAN). You can find more information in the vSAN Planning and Deployment – VMware vSphere 7.0 documentation, which should be studied before configuring vSAN, along with the relevant release notes – in this example I am using vSAN 7.0 Update 1.
- VMware vSAN can be built on the following hardware:
- vSAN ReadyNode – preconfigured solutions using hardware tested and certified for vSAN by the server OEM and VMware
- Turn key deployments – fully packaged Hyper-Converged Infrastructure (HCI) solutions like Dell EMC VxRail
- Custom solution – hardware components compiled by the user, all hardware used with vSphere 7 and vSAN 7 must be listed in the VMware Compatibility Guide
- To check version compatibility with other VMware products, see also the VMware Product Interoperability Matrices.
- A standard vSAN cluster needs at least 3 hosts, with a maximum of 64. At least 4 hosts are recommended for maximum availability, because a 3-host cluster has limitations around maintenance and protection after a failure. The 2-host vSAN cluster with a witness is a separate configuration and an exception to the 3-host minimum.
- Each physical host contributing capacity to the vSAN cluster requires:
- 1 x SAS or SATA HBA, or RAID controller in passthrough mode
- 1 x SAS or SATA SSD, or PCIe flash device, for the cache
- For capacity in an all-flash disk group, at least 1 x further SAS or SATA SSD, or PCIe flash device; for capacity in a hybrid disk group, at least 1 x SAS or NL-SAS magnetic disk. In both cases the device must have no existing partition configuration
- A minimum of 8 GB RAM, but in most cases it is preferable to have at least 32 GB RAM
- Dedicated 1 Gbps bandwidth for hybrid configurations (10 Gbps recommended), or dedicated or shared 10 Gbps for all-flash configurations (25 Gbps, 40 Gbps, and 100 Gbps are also supported) – for best results new environments should consider 25 Gbps connectivity using vSphere Distributed Switches with Network I/O Control (vSphere Standard Switches are also supported but do not offer QoS)
- A configured VMkernel network adapter for vSAN traffic
- A maximum network latency of 1 ms RTT for standard vSAN clusters (200 ms to a witness node, 5 ms for stretched clusters)
- Layer 2 or Layer 3 network connectivity between hosts in the cluster (jumbo frames are supported but not required, if jumbo frames are already in use then the setting should be configured end-to-end across the environment)
- A valid vSAN license, normally managed per CPU although per OSI licensing is available for branch office configurations
- When sizing a vSAN cluster keep in mind the total capacity of all disks pooled together is only the raw capacity. True payload capacity can be calculated using the primary level of failures to tolerate, in conjunction with the failure tolerance method (RAID). For more information review the Designing and Sizing a vSAN Cluster documentation.
- Prior to vSAN 7.0 U1, the general recommendation was to keep the vSAN datastore below 70% usage. The latest release makes substantially better use of free capacity, so the required free space can now be calculated per cluster based on variables outlined in the Designing for Capacity section of the VMware vSAN Design Guide.
- It is good practice to synchronise ESXi and vCenter versions, and run the latest release. Hosts should also be in the same L2 subnet for best networking performance.
- If your environment has firewalls review the list of Required ports for vSAN.
- For larger enterprise environments see also the vSAN Configuration Limits.
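To make the sizing note above about raw versus payload capacity more concrete, here is a minimal Python sketch. It assumes the commonly quoted overhead multipliers for each failure tolerance method: RAID-1 mirroring keeps one extra full copy per failure tolerated, RAID-5 uses 3 data plus 1 parity components, and RAID-6 uses 4 data plus 2 parity components. The function name and the `reserve_fraction` parameter are hypothetical; the latter is a stand-in for whatever free-space reserve the Design Guide calculation produces for your cluster. Witness components and metadata overhead are ignored.

```python
# Raw capacity consumed per unit of payload, keyed by (method, failures to tolerate).
OVERHEAD = {
    ("RAID-1", 1): 2.0,    # 2 full copies
    ("RAID-1", 2): 3.0,    # 3 full copies
    ("RAID-5", 1): 4 / 3,  # 3 data + 1 parity
    ("RAID-6", 2): 1.5,    # 4 data + 2 parity
}


def usable_capacity_gb(raw_gb, method="RAID-1", ftt=1, reserve_fraction=0.0):
    """Approximate payload capacity (GB) for a given raw capacity and policy."""
    key = (method, ftt)
    if key not in OVERHEAD:
        raise ValueError(f"unsupported combination: {method} with FTT={ftt}")
    if not 0 <= reserve_fraction < 1:
        raise ValueError("reserve_fraction must be in the range [0, 1)")
    return raw_gb * (1 - reserve_fraction) / OVERHEAD[key]


# Example: 1,000 GB of raw capacity with RAID-1 mirroring and FTT=1.
print(usable_capacity_gb(1000, "RAID-1", 1))
```

The lookup table deliberately rejects combinations it does not know about rather than guessing an overhead.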
In this example we’ll use the vSphere Cluster Quickstart page to configure vSAN. Quickstart consolidates the storage and networking workflows required to activate vSAN. A new cluster has been created containing 2 ESXi hosts running 7.0 U1C. The hosts are in maintenance mode and have no existing datastores or partition information beyond the standard boot disk. Both hosts are using PCIe flash devices in passthrough mode.
A third host will act as the witness node. The witness for a 2-host vSAN cluster needs to have available disks for writing metadata; at least 10 GB cache and 15 GB capacity. All 3 hosts need a VMkernel port configured. Since this is a lab environment, with limited physical connections and bandwidth, I have configured the management vmk port to also be used for vSAN traffic. The vmk port is a virtual adapter used to handle VMware service traffic for various functionality. If you need guidance on setting up the VMkernel adapter for vSAN, see the How to Configure vSAN VMkernel Networking Knowledge Base page.
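The witness disk minimums mentioned above are easy to check before starting the wizard. The sketch below is purely illustrative: the function name is made up, and the thresholds are simply the 10 GB cache and 15 GB capacity figures quoted in this post.

```python
WITNESS_MIN_CACHE_GB = 10
WITNESS_MIN_CAPACITY_GB = 15


def witness_disk_problems(cache_gb, capacity_gb):
    """Return a list of problems with the witness disks; empty means OK."""
    problems = []
    if cache_gb < WITNESS_MIN_CACHE_GB:
        problems.append(
            f"cache disk is {cache_gb} GB, need at least {WITNESS_MIN_CACHE_GB} GB"
        )
    if capacity_gb < WITNESS_MIN_CAPACITY_GB:
        problems.append(
            f"capacity disk is {capacity_gb} GB, need at least {WITNESS_MIN_CAPACITY_GB} GB"
        )
    return problems


# The witness in this lab uses a 10 GB cache disk and a 15 GB capacity disk.
print(witness_disk_problems(10, 15))
```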
Now that the VMkernel ports are set up for vSAN traffic, and there is IP reachability between the vSAN cluster hosts and the witness node, we can start the vSAN configuration. Select the cluster in the vSphere client and click Configure > Quickstart. For stage 1, click Edit and select the vSAN service. After a couple of seconds the prerequisite health checks in stage 2 are complete. Provided no issues arise, move on to stage 3 and click Configure.
Configure the network settings for the vSAN cluster. The Quickstart setup uses vSphere Distributed Switches, which are recommended, although vSphere Standard Switches are also supported. In my lab, since I already enabled vSAN traffic on the management port, I can skip the Distributed Switch setup, and click Next.
Configure the vSAN cluster settings, like encryption, compression, and deduplication, as required. In this example I am using the Two node vSAN cluster deployment type. Click Next.
Select the disks and tier to be claimed for the vSAN cluster. Remember that vSAN can only use local or direct-attached storage, and not remote storage. In this example 2 x 500 GB flash devices have been allocated to the capacity tier, and 2 x 50 GB flash devices have been allocated to the cache tier. The total of the claimed disks is 1.07 TB. This does not provide any component failure protection and is only for lab purposes. I accept the recommended configuration and click Next.
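The 1.07 TB figure is the sum of all claimed devices, cache included. The short Python sketch below reproduces it, assuming the client reports binary units (1 TB = 1024 GB); that unit convention is my assumption rather than something stated in the wizard. It also shows the raw datastore size once the cache devices are excluded, since cache does not count towards the shared datastore.

```python
capacity_disks_gb = [500, 500]  # one capacity device per host
cache_disks_gb = [50, 50]       # one cache device per host

claimed_gb = sum(capacity_disks_gb) + sum(cache_disks_gb)
claimed_tb = claimed_gb / 1024  # assumes binary units, as noted above
datastore_raw_gb = sum(capacity_disks_gb)

print(f"Claimed: {claimed_gb} GB = {claimed_tb:.2f} TB")
print(f"Raw datastore capacity: {datastore_raw_gb} GB")
```

Running this prints a claimed total of 1100 GB (1.07 TB) and a raw datastore capacity of 1000 GB.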
Since my vSAN cluster has only 2 nodes, I need to add a witness host. The witness host, with available disks for metadata and a vSAN-enabled VMkernel adapter for communication, is selected and passes the compatibility checks. Click Next to continue.
Claim the disks for the witness host to use. In this case I have allocated a 10 GB disk for the cache tier metadata, and a 15 GB disk for the capacity tier metadata. Click Next to continue.
Review the settings configured and click Finish to deploy the vSAN configuration. Although the Quickstart interface returns a message pretty quickly saying the cluster is configured, keep an eye on activity in the Recent Tasks pane as there is likely still configuration taking place.
The easiest way to check the vSAN status is to select the cluster, click Monitor, and scroll down to vSAN. Skyline Health shows the vSAN health checks associated with the cluster. You can also see physical and virtual object states, capacity, and performance.
To view or manually edit the cluster settings, select the cluster, click Configure, and scroll down to vSAN. Services shows the available vSAN services and their configuration; in my lab environment most of these are disabled. Disk Management shows the configured disk groups and their health state. In this lab scenario I only have 2 fault domains configured.
Fault domains allow grouping together of physical hosts to protect against common failures like chassis or racks. It is best practice to configure consistent fault domains with the same number of hosts across the environment. Consider the impact on placement of data and overall number of host failures to tolerate when configuring fault domains. Clearly for a lab environment or a 2-node cluster in a small branch office setup fault domains and data availability cannot be applied in the same way as larger deployments. The following resources will help with designing such environments:
Finally, if you want to create a new storage policy to apply to the vSAN datastore, or create multiple granular policies that can be applied at VM or VMDK level, this can be done from the Menu dropdown, Policies and Profiles, VM Storage Policies. If you need more information on the policy options available review the VM Storage Policy Design Considerations documentation.