Just brushed up on vSphere High Availability (HA). It's not super complicated really: HA allows VMs to be restarted on another host when their own host goes down. You simply enable HA on the cluster and it monitors the hosts. The cluster works out that a host is down by having a “primary host” monitor the entire cluster. The hosts hold an “election” to pick the primary host, and if the primary host goes down a new one is elected. The primary host monitors the other, secondary, hosts, and when one of them goes down it restarts that host's VMs on another host.
There are three types of failure with these hosts. There's isolation, where a single host can't communicate with any of the others. There's a partition, where two or more hosts can communicate with each other but not with the rest of the network. Finally there's a plain failure, where the host is simply down. The primary host tells these apart by exchanging network “heartbeats” with each host every second. When it stops receiving them it investigates further, for example by checking whether the host is still communicating with a datastore. If network heartbeats are not being received but the host is still communicating with the datastore, the primary host will not restart its VMs, because the host is still alive; it just continues to monitor it.
You can also enable these heartbeats on VMs using VMware Tools, so instead of only watching for a host going down, HA will check whether individual VMs are having issues and restart them if they are. There is even an option to enable it for specific applications running on your VMs, so if a critical application goes down HA will attempt to restart the VM and bring it back up for you.
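
To make the host-failure check a bit more concrete, here is a rough Python sketch of the decision as I described it above. This is only my simplified model, not VMware's actual code: the names `HostState`, `classify_host` and `should_restart_vms` are made up for illustration, and the real HA agent looks at more signals than these two.

```python
from enum import Enum


class HostState(Enum):
    HEALTHY = "healthy"
    ISOLATED_OR_PARTITIONED = "isolated or partitioned"
    FAILED = "failed"


def classify_host(network_heartbeat: bool, datastore_heartbeat: bool) -> HostState:
    """Simplified model of how the primary host judges a secondary host.

    Assumption: only two signals are considered, the network heartbeat
    and whether the host is still talking to a datastore.
    """
    if network_heartbeat:
        # Network heartbeats are arriving, so nothing to investigate.
        return HostState.HEALTHY
    if datastore_heartbeat:
        # No network heartbeat, but the host still reaches the datastore:
        # it is alive, just cut off from the primary host.
        return HostState.ISOLATED_OR_PARTITIONED
    # Neither signal is present, so the host is treated as down.
    return HostState.FAILED


def should_restart_vms(state: HostState) -> bool:
    """In this model, VMs are restarted elsewhere only for a failed host."""
    return state is HostState.FAILED


if __name__ == "__main__":
    for net, ds in [(True, True), (False, True), (False, False)]:
        state = classify_host(net, ds)
        print(net, ds, state.value, should_restart_vms(state))
```

The point of the middle branch is the case from above: the network heartbeat is gone, but the datastore heartbeat shows the host is still running, so the VMs are left where they are.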
Then finally there’s admission control, which I won’t get into too much in this post. It lets you set the number of host failures the cluster should be able to tolerate, and HA reserves enough spare capacity to restart the VMs from that many hosts. Once powering on another VM would eat into that reserved capacity, vCenter will not allow you to power it on, as that space is being held back for HA.
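
The arithmetic behind admission control can be sketched roughly like this. It is a toy model under my own assumptions, not how vSphere actually calculates it (the real admission control policies are more involved): capacity is counted in GB of memory only, the reserve is sized for losing the largest hosts, and the function names are hypothetical.

```python
def reserved_capacity(host_capacities_gb, failures_to_tolerate):
    """Capacity held back for HA: enough to cover losing the largest hosts."""
    largest_first = sorted(host_capacities_gb, reverse=True)
    return sum(largest_first[:failures_to_tolerate])


def can_power_on(host_capacities_gb, failures_to_tolerate, used_gb, vm_gb):
    """Allow a power-on only if it does not eat into the HA reserve."""
    usable = sum(host_capacities_gb) - reserved_capacity(
        host_capacities_gb, failures_to_tolerate
    )
    return used_gb + vm_gb <= usable


if __name__ == "__main__":
    hosts = [256, 256, 256, 256]  # four hosts with 256 GB each (made-up numbers)
    # Tolerating one host failure leaves 768 GB usable out of 1024 GB.
    print(can_power_on(hosts, 1, used_gb=700, vm_gb=64))  # fits within 768 GB
    print(can_power_on(hosts, 1, used_gb=700, vm_gb=96))  # would exceed 768 GB
```

With four equal hosts and one tolerated failure, a quarter of the cluster is effectively off limits for day-to-day use, which is the trade-off you accept for being able to restart everything after a host dies.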