Here’s a lesson in checking the basics! I added a new ESXi 5 host to a cluster today and spent a good couple of hours troubleshooting this error:
vSphere HA agent for host [Host’s Name] has an error in [Cluster’s Name] in [Datacenter’s Name]: vSphere HA agent cannot be correctly installed or configured
After a few basic checks, moving the host in and out of the cluster and rebooting it, I headed off to Google and began troubleshooting.
Cannot install the vSphere HA (FDM) agent on an ESXi host - this article suggests that the host might be in lockdown mode. That was unlikely, since we don’t use lockdown mode, but I checked anyway:
Get-VMHost esxi001.definit.co.uk | Select-Object Name,@{N="LockDown";E={$_.ExtensionData.Config.AdminDisabled}} | Format-Table -AutoSize Name,LockDown
This returned false - no lockdown.
To exit lockdown mode, you can use:
(Get-VMHost esxi001.definit.co.uk | Get-View).ExitLockdownMode()
I spent a good amount of time going through the list in Troubleshooting VMware High Availability (HA) in vSphere, which isn’t entirely relevant to ESXi but has some good pointers nonetheless.
I finally got to Reconfiguring HA (FDM) on a cluster fails with the error: Operation timed out, with the following gem of info:
_This issue occurs if the vSphere High Availability Agent service on the ESXi host is stopped._
*Facepalm* - I checked the host’s services and set the vSphere High Availability Agent service to start and stop automatically. HA is now happily configured.
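For next time, the same service check can probably be done from PowerCLI rather than the client. This is only a rough sketch, not what I actually ran: it assumes you're already connected with Connect-VIServer, that the HA agent's service key is "vmware-fdm" (I fall back to matching on the label in case it isn't), and that the "automatic" policy is the startup setting you want.

```powershell
# Assumes an existing Connect-VIServer session to vCenter.
$vmhost = Get-VMHost esxi001.definit.co.uk

# Assumption: the HA (FDM) agent's service key is "vmware-fdm";
# matching on the label is a fallback in case the key differs.
$ha = Get-VMHostService -VMHost $vmhost |
    Where-Object { $_.Key -eq "vmware-fdm" -or $_.Label -like "*High Availability*" }

# Show whether the service is running and what its startup policy is.
$ha | Select-Object Key, Label, Running, Policy | Format-Table -AutoSize

# Set the startup policy (valid values are "automatic", "on" and "off"),
# then start the service if it is currently stopped.
$ha | Set-VMHostServicePolicy -Policy "automatic"
$ha | Where-Object { -not $_.Running } | Start-VMHostService
```

If the first command returns nothing, list everything with Get-VMHostService -VMHost $vmhost and look for the HA agent by eye before changing any policies.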
No matter how much you know, you gotta check the basics!