Virtualization Adapted: Adapting Business Processes for Virtual Infrastructure (and vice-versa)

2009/06/26

VMware ESX High Availability – Tips and Tricks

If VMware HA doesn’t work or won’t configure, run through the following checks:

  1. Verify that host name is lowercase: hostname; hostname -s
  2. Verify that host name in /etc/hosts is lowercase
  3. Verify that search domain in /etc/resolv.conf is in lowercase
  4. Verify that host name in /etc/sysconfig/network is fqdn, all lowercase
  5. Verify that the host name in esx.conf is fqdn, all lowercase
  6. Verify that host name in DNS is lowercase: nslookup <short hostname> (should properly resolve to the fqdn of the host, all lowercase)
  7. Verify that the primary service console has the same name on every host.
  8. Verify that all primary service consoles are in the same IP subnet.
  9. If VMotion vmkernel port is on same vSwitch as primary service console, use das.allowVmotionNetworks=1
  10. If host has multiple service consoles, use KB 1006541 and the das.allowNetwork0 HA option to ensure that only the primary service console is allowed.
  11. Verify that the customer has appropriate licensing for HA and has licenses available: in LM Tools, perform a status inquiry and verify that the customer is licensed for VC_DAS
  12. Once you have met all of the above criteria, enable HA.
  13. If you have verified all of the above and HA still won’t configure:
      1. On the host, stop vpxa: service vmware-vpxa stop
      2. After a while, the host will show as not responding in VC
      3. Disconnect the host from VC
      4. Re-connect the host to VC
      5. This forces both the VPXA package and the HA packages to re-deploy.
      6. Re-configure the hosts for HA.
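
The name and subnet checks in the list above are easy to get wrong by eye, so here is a minimal Python sketch of how they could be scripted. It is not a VMware tool: the function names (mixed_case_names, is_lowercase_fqdn, same_subnet) are made up for illustration, and it assumes you paste in the contents of /etc/hosts and the service console addresses yourself.

```python
import ipaddress


def mixed_case_names(hosts_text):
    """Return host names in /etc/hosts-style text that are not all lowercase."""
    bad = []
    for line in hosts_text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and surrounding space
        if not line:
            continue
        fields = line.split()
        for name in fields[1:]:  # fields[0] is the IP address
            if name != name.lower():
                bad.append(name)
    return bad


def is_lowercase_fqdn(name):
    """True if the name contains a dot and has no uppercase characters."""
    return "." in name and name == name.lower()


def same_subnet(interfaces):
    """True if every 'address/prefix' string falls in the same IP network."""
    networks = {ipaddress.ip_interface(i).network for i in interfaces}
    return len(networks) == 1
```

For example, same_subnet(["10.0.0.11/24", "10.0.0.12/24"]) is true, while putting one host's service console on 10.0.1.x/24 makes it false. The addresses are placeholders, not values from a real cluster.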

Many thanks to: Kevin Riley (kriley@vmware.com)

See also:
http://vmwaretips.com/wp/2008/10/20/advanced-settings-for-vmware-ha/

http://blog.spudz.org/?p=388

http://kb.vmware.com/kb/1006541
As of VirtualCenter 2.5 Update 2, configuration of VMware High Availability may fail.
An error similar to the following appears in the Tasks and Events detail:

HA agent on <esxhostname> in cluster <clustername> in <datacenter> has an error Incompatible HA Networks:

Cluster has network(s) missing on host: x.x.x.x

Consider using the Advanced Cluster Settings das.allowNetwork to control network usage.
das.allowVmotionNetworks – Allows a NIC that is used for VMotion networks to be considered for VMware HA usage. This parameter enables a host that has only one NIC configured for management and VMotion combined to be used in VMware High Availability communication. By default, any VMotion network is ignored.

das.allowNetwork[…] – Allows the use of port group names to control the networks used for VMware HA. The value is set as the name of the portgroup, for example, Service Console or Management Network. When configured, the VMware HA cluster only uses the specified networks for VMware HA communication.
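
To make the interaction between the two options easier to reason about, here is a toy Python model of the selection rules as described above. It is only a sketch: ha_candidate_networks is a hypothetical helper, not anything VMware ships. Treating "1" or "true" as the enabling value, and letting das.allowNetwork entries take precedence over the VMotion rule, are assumptions on my part.

```python
def ha_candidate_networks(networks, options):
    """Toy model of which port groups HA would consider.

    networks: list of (portgroup_name, is_vmotion) tuples for one host.
    options:  dict of advanced option name -> value.
    """
    # Any das.allowNetwork... entry restricts HA to the named port groups.
    allowed = [value for key, value in options.items()
               if key.startswith("das.allowNetwork")]

    # Assumption: "1" or "true" (any case) switches the option on.
    raw = str(options.get("das.allowVmotionNetworks", "")).lower()
    allow_vmotion = raw in ("1", "true")

    selected = []
    for name, is_vmotion in networks:
        if allowed:
            # Assumption: an explicit allow list overrides the VMotion rule.
            if name in allowed:
                selected.append(name)
        elif is_vmotion and not allow_vmotion:
            continue  # by default, VMotion networks are ignored
        else:
            selected.append(name)
    return selected
```

With a host that has "Service Console" and a VMotion port group, the model returns only "Service Console" when no options are set, both once das.allowVmotionNetworks is enabled, and only the named port group when a das.allowNetwork entry is present.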

To configure VMware HA to use the new settings:

  1. Log in to VirtualCenter with the VI Client as an administrator.
  2. Edit the settings of the cluster and deselect Enable VMware HA.
  3. Click OK, and wait for the servers to unconfigure for VMware HA.
  4. Click ESX Server > Configuration > Networking on each of the ESX hosts in the cluster and note the portgroups that are common between the servers.
  5. Edit the settings of the cluster, and select Enable VMware HA.
  6. Click VMware HA.
  7. Click Advanced Options.
  8. Add the das.allowNetworkX option with a value of the portgroup name, where X is a number between 1 and 10.
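
Finding the portgroups common to every host is a set intersection, and the option names follow a simple numbering pattern. Here is a short Python sketch of both. The helper names (common_portgroups, allow_network_options) and the host and portgroup names in the example are made up. You would still collect the portgroup lists by hand from the Networking tab, and enter the resulting options in the VI Client yourself.

```python
def common_portgroups(host_portgroups):
    """Return the sorted port group names present on every host.

    host_portgroups: dict mapping host name -> iterable of port group names.
    """
    sets = [set(names) for names in host_portgroups.values()]
    if not sets:
        return []
    return sorted(set.intersection(*sets))


def allow_network_options(portgroups):
    """Build das.allowNetworkX entries, with X counting up from 1."""
    if len(portgroups) > 10:
        # The steps above give X as a number between 1 and 10.
        raise ValueError("at most 10 das.allowNetworkX entries")
    return {f"das.allowNetwork{i}": name
            for i, name in enumerate(portgroups, start=1)}
```

The intersection only tells you which names exist everywhere; it will happily include virtual machine portgroups. Pick the service console (or management) portgroup from the result before building the options.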

IR: Wednesday, June 24, 2009
