What triggers failover in MarkLogic Server?
21 May 2020 08:32 AM
Each node in a cluster communicates with all of the other nodes in the cluster at periodic intervals. This periodic communication, known as a heartbeat, circulates key information about host status and availability between the nodes in a cluster. Through this mechanism, the cluster determines which nodes are available and communicates configuration changes with other nodes in the cluster. If a node goes down for some reason, it will stop sending heartbeat packets to the other nodes in the cluster.
The cluster uses the heartbeat to determine if a node in the cluster is down. A heartbeat message from a given node contains its view of the current state of the cluster at the moment the heartbeat was generated. The determination that a node is down is based on a vote from each node in the cluster. In order to vote a node out of the cluster, there must be a quorum of nodes voting to remove it.
A quorum occurs when more than 50% of the total number of nodes in the cluster (including any nodes that are down) vote the same way. Each host votes based on how long it has been since it last received a heartbeat from the node in question. If more than half of the nodes in the cluster determine that a node is down, that node is disconnected from the cluster. The wait time before a host is disconnected from the cluster is typically considerably longer than the time needed to restart a host, so restarts should not cause hosts to be disconnected from the cluster (and therefore should not cause forests to fail over).
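The quorum rule above can be sketched in a few lines. This is an illustrative model, not MarkLogic internals; the function name and signature are assumptions made for the example.

```python
# Hypothetical sketch of the quorum rule described above: a node is voted
# out only when strictly more than 50% of ALL cluster nodes (including
# nodes that are themselves down) vote the same way.

def has_quorum(votes_to_remove: int, total_nodes: int) -> bool:
    """True when strictly more than half of the cluster votes the same way."""
    return votes_to_remove > total_nodes / 2

# Example: in a 5-node cluster, 3 votes form a quorum but 2 do not,
# and a 3-node cluster needs at least 2 agreeing votes.
print(has_quorum(3, 5))  # True
print(has_quorum(2, 5))  # False
```

Note that because nodes that are down still count toward the total, a cluster that loses half or more of its nodes can no longer form a quorum to vote anything out.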
There are group configuration parameters that determine how long the cluster waits before removing a node (for details, see the XDQP Timeout, Host Timeout, and Host Initial Timeout parameters).
Each node in the cluster continues listening for the heartbeat from the disconnected node to see whether it has come back up. If a quorum of nodes in the cluster again receives heartbeats from that node, it automatically rejoins the cluster.
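The per-node bookkeeping behind this detection can be sketched as follows. This is a hedged illustration of the mechanism, not actual MarkLogic code; the class, the 30-second timeout value, and the host names are all assumptions made for the example (the real timeout is a group configuration parameter).

```python
# Illustrative sketch (not MarkLogic internals): each node records the last
# heartbeat it received from every peer and votes a peer "down" once the
# silence exceeds a configured host timeout.

HOST_TIMEOUT_SECS = 30.0  # assumed value; the real setting is a group parameter

class HeartbeatTracker:
    def __init__(self) -> None:
        self.last_seen: dict[str, float] = {}  # peer host -> last heartbeat time

    def record_heartbeat(self, peer: str, now: float) -> None:
        # Receiving a heartbeat resets the clock for this peer, which is
        # why a rejoining node is noticed as soon as it resumes sending.
        self.last_seen[peer] = now

    def votes_down(self, peer: str, now: float) -> bool:
        # A peer never heard from is treated as down.
        last = self.last_seen.get(peer)
        return last is None or (now - last) > HOST_TIMEOUT_SECS

tracker = HeartbeatTracker()
tracker.record_heartbeat("host-b", now=100.0)
print(tracker.votes_down("host-b", now=110.0))  # False: 10s of silence
print(tracker.votes_down("host-b", now=140.0))  # True: 40s exceeds the timeout
```

Each node maintains this view independently; only when a quorum of nodes' individual votes agree is the peer actually disconnected from the cluster.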
The heartbeat mechanism allows the cluster to recover gracefully from things like hardware failures or other events that might make a host unresponsive. This occurs automatically, without any human intervention; machines can go down and automatically come back up without requiring intervention from an administrator.
Hosts with Content Forests
If the node that goes down hosts content in a forest, then the database to which that forest belongs will go offline until the forest either comes back up or is detached from the database.
If you have failover enabled and configured for the forest whose host is removed from the cluster, the forest will attempt to fail over to a secondary host (that is, one of the secondary hosts will attempt to mount the forest). Once that occurs, the database will come back online.
For shared-disk failover, there is an additional criterion that can prevent a forest from failing over. The forest's label file is updated regularly by the host that is managing the forest. To avoid corrupting data on the shared file system, the forest will not fail over while it is being actively managed; the forest's label file time stamp is checked to ensure that the forest is not currently under active management. This can occur when a host is isolated from the other nodes in the cluster but can still access the forest data on the shared disk.
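The label-file safety check can be sketched like this. This is an assumed illustration of the idea, not MarkLogic's actual implementation; the function name, the staleness threshold, and the file path are hypothetical.

```python
import os

# Hedged sketch of the shared-disk safety check described above: before
# mounting a forest, a failover host inspects the forest's label file. A
# recently updated time stamp means another host is still actively managing
# the forest, so taking it over could corrupt data on the shared file system.

STALE_AFTER_SECS = 60.0  # assumed threshold; not an actual MarkLogic setting

def safe_to_fail_over(label_file_path: str, now: float) -> bool:
    """True only if the label file has not been updated recently."""
    mtime = os.path.getmtime(label_file_path)
    return (now - mtime) > STALE_AFTER_SECS
```

This is why an isolated host that can still write to the shared disk blocks failover: its ongoing label-file updates keep the time stamp fresh, so the failover hosts refuse to mount the forest.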
Tips for Handling Failover on a Busy Cluster
This Knowledgebase Article contains a good discussion of how to handle failover that occurs frequently because your cluster hosts are sometimes too busy to respond in a timely manner. The section on "Improving the situation" contains step-by-step instructions for group and database settings tuned for a very busy cluster.