Following on from the vRealize Operations 6.4 Install Guide, this post details High Availability (HA) for vRealize Operations Manager. Implementing HA protects the analytics cluster against the loss of a single node. For example, should the master node fail, services automatically fail over to the replica node within 2-3 minutes. Following a failover the cluster runs in degraded mode and cannot tolerate the loss of another node until it is returned to HA mode, either by repairing or replacing the failed node, or by removing it if sufficient nodes remain in the cluster.
The analytics cluster is made up of the master node, the replica node, and one or more data nodes. It does not include any remote collector nodes.
- There should be sufficient hosts in the vSphere cluster to have no more than one node running on each host. HA does not protect against the loss of more than one node; only one replica node can be configured.
- Each node in the analytics cluster requires a static IP address.
- When adding additional nodes keep in mind the following:
  - All nodes must be running the same version.
  - All nodes must use the same deployment type, i.e. virtual appliance, Windows, or Linux.
  - All nodes must be sized the same in terms of CPU, memory, and disk.
  - Nodes can be in different vSphere clusters, but must be in the same physical location and subnet.
  - Time must be synchronised across all nodes.
- Click here to see a full list of multiple node cluster requirements.
- Note that for existing clusters the master node must be online to enable HA, and enabling HA restarts the cluster. This does not apply to new clusters where the cluster has not yet been started.
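The node requirements above lend themselves to a quick sanity check before you start deploying. The following is a minimal sketch only: the `Node` record, its field names, and `check_cluster` are hypothetical and not part of any vRealize Operations API, and the values would have to be filled in by hand from your own inventory.

```python
from dataclasses import dataclass
from ipaddress import ip_address, ip_network


@dataclass(frozen=True)
class Node:
    # Hypothetical record describing one analytics cluster node.
    name: str
    version: str
    deployment_type: str  # e.g. "appliance", "windows", "linux"
    cpu: int
    memory_gb: int
    disk_gb: int
    ip: str


def check_cluster(nodes, subnet):
    """Return a list of problems; an empty list means every check passed."""
    problems = []
    comparisons = [
        ("version", lambda n: n.version),
        ("deployment type", lambda n: n.deployment_type),
        ("size", lambda n: (n.cpu, n.memory_gb, n.disk_gb)),
    ]
    # Every node must report the same value for each attribute.
    for label, key in comparisons:
        if len({key(n) for n in nodes}) > 1:
            problems.append(f"nodes differ in {label}")
    # Every node must have an address inside the shared subnet.
    net = ip_network(subnet)
    for n in nodes:
        if ip_address(n.ip) not in net:
            problems.append(f"{n.name} is outside {subnet}")
    return problems
```

Time synchronisation and physical location are not covered here, as they cannot be verified from static values like these.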
Deploy the Data Node
In order to configure HA a data node must be deployed and then converted into a replica node. The replica node holds a copy of all data stored in the master node. First let’s deploy our new node; download vRealize Operations Manager here.
Navigate to the vSphere web client home page, click vRealize Operations Manager and select Deploy vRealize Operations Manager.
The OVF template wizard will open. Browse to the location of the OVA file and click Next.
Enter a name for the virtual appliance, and select a location. Click Next.
Select the host or cluster compute resources for the virtual appliance and click Next. Remember this should be on a different host to the master node.
Review the details of the OVA, click Next.
Accept the EULA and click Next.
Select the same configuration size as the master node and click Next.
Select the storage for the virtual appliance, click Next. For HA you should use a different datastore to the master node.
Select the network for the virtual appliance, click Next.
Configure the virtual appliance network settings, click Next. Click Finish on the final screen to begin deploying the virtual appliance.
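If you prefer to script the deployment rather than click through the wizard, VMware's ovftool can deploy the same OVA. The sketch below only assembles the command line and does not run anything. The VM name, datastore, network, and target locator are placeholders, and the deployment option identifier (`small` here) is an assumption that must match an option defined in your OVA. The appliance network settings are deliberately left out because the OVF property keys depend on the OVA.

```python
import shlex


def build_ovftool_command(ova_path, vm_name, datastore, network,
                          deployment_option, target):
    """Assemble an ovftool argument list; nothing is executed here."""
    return [
        "ovftool",
        "--acceptAllEulas",                         # accept the EULA
        f"--name={vm_name}",                        # virtual appliance name
        f"--datastore={datastore}",                 # ideally not the master node's datastore
        f"--network={network}",                     # destination port group
        f"--deploymentOption={deployment_option}",  # same size as the master node
        ova_path,
        target,
    ]


# Placeholder values for illustration only.
cmd = build_ovftool_command(
    ova_path="vrops.ova",
    vm_name="vrops-data-01",
    datastore="datastore02",
    network="VM Network",
    deployment_option="small",
    target="vi://administrator@vcenter.example.local/DC/host/Cluster",
)
print(shlex.join(cmd))
```

Running ovftool against the OVA file on its own, with no target, prints the deployment options and properties it defines, which is a useful way to confirm the identifiers before deploying.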
Configure the Replica Node
Once the virtual appliance has been deployed and is powered on, open a web browser to the FQDN or IP address configured during deployment. Select Expand Existing Installation.
Click Next to begin the setup wizard.
Enter the name of the new node, ensure Data is selected as the node type. Enter the IP address or FQDN of the master node and click Validate. Tick Accept this certificate and click Next to continue.
Enter the admin password of the master node and click Next.
You will now be returned to the cluster admin page. Note the Waiting to finish cluster expansion. Installation in progress… message; the new node is being configured and will likely take 5-10 minutes. If you want to add any additional data nodes or remote collector nodes you can repeat the process above. For the purposes of this post we are adding a single data node to be converted to a replica. When you’re ready click Finish Adding New Node(s) and Ok to continue.
Once the node to be used as a replica is online we can configure HA. Locate the High Availability section at the top right of the admin page; the status will be set to disabled. Click Enable. Ensure the correct data node is selected to be converted to a replica node, tick Enable High Availability for this cluster, and click Ok.
Configuring HA can take up to 20 minutes and the cluster will restart. Click Yes to continue.
You may need to log back into the admin console. The admin console for vRealize Operations Manager can be accessed by browsing to http://<vROps>/admin where <vROps> is the IP address or FQDN of your vRealize Operations Manager appliance or server. The HA status will now show enabled and the cluster online. Note that the role of the node has now changed to Master Replica.
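Because the cluster restarts while HA is configured, it can be handy to poll for the admin page rather than refresh the browser. This is a minimal sketch using only the Python standard library; `wait_until` and `admin_page_reachable` are hypothetical helper names. Certificate verification is switched off on the assumption that the appliance presents a self-signed certificate, which you should not do against anything you do not trust.

```python
import ssl
import time
import urllib.request


def wait_until(probe, timeout_s, interval_s,
               sleep=time.sleep, clock=time.monotonic):
    """Call probe() until it returns True or timeout_s elapses."""
    deadline = clock() + timeout_s
    while True:
        if probe():
            return True
        if clock() >= deadline:
            return False
        sleep(interval_s)


def admin_page_reachable(url):
    """Return True if the URL answers with a successful HTTP response."""
    ctx = ssl.create_default_context()
    ctx.check_hostname = False       # assumption: self-signed certificate
    ctx.verify_mode = ssl.CERT_NONE
    try:
        with urllib.request.urlopen(url, timeout=10, context=ctx) as resp:
            return 200 <= resp.status < 400
    except OSError:
        # Covers connection errors, timeouts, and HTTP error responses.
        return False
```

For example, `wait_until(lambda: admin_page_reachable(url), timeout_s=1200, interval_s=30)` checks every 30 seconds for up to 20 minutes, where `url` is your admin console address. A reachable page only tells you the web interface is back; the HA status itself still needs to be confirmed in the admin console.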
The same data can now be accessed via both the master node and the replica node. Consider implementing load balancing for larger environments; review the vRealize Operations Manager Load Balancing document.
The final step is to configure an anti-affinity rule to stop the master and replica nodes (and any data nodes) from running on the same hosts. Log into the vSphere web client and browse to Hosts and Clusters. Click the vSphere cluster and select the Manage tab. Under Configuration click VM/Host Rules. Under VM/Host Rules click Add.
Enter a name for the rule, such as vRealize Operations Nodes, ensure Enable rule is ticked and select Separate Virtual Machines as the rule type. Click Add and select the vRealize Operations nodes. Click Ok.
This rule will ensure DRS does not place nodes on the same hosts in a vSphere cluster.
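To confirm the rule is having the intended effect, you can compare where each node is currently running. The sketch below assumes you have already gathered a mapping of VM name to host name by some other means (for example, copied from the web client); `find_colocated_nodes` is a hypothetical helper and does not talk to vCenter.

```python
from collections import defaultdict


def find_colocated_nodes(placement):
    """Map each host running more than one node to the names of those nodes."""
    by_host = defaultdict(list)
    for vm, host in placement.items():
        by_host[host].append(vm)
    # Keep only hosts that violate the one-node-per-host requirement.
    return {host: sorted(vms) for host, vms in by_host.items() if len(vms) > 1}


# Placeholder names for illustration only.
placement = {
    "vrops-master": "esx01",
    "vrops-replica": "esx02",
    "vrops-data-01": "esx02",
}
violations = find_colocated_nodes(placement)
```

An empty result means every node is on its own host; any entry in `violations` names a host whose failure would take out more than one node.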