NTP Host Configuration using chronyd or ntpd
16 February 2021 03:20 PM

Summary

Clock synchronization plays a critical part in the operation of a MarkLogic Cluster.

MarkLogic Server expects the system clocks to be synchronized across all the nodes in a cluster, as well as between Primary and Replica clusters. The acceptable level of clock skew (or drift) between hosts is less than 0.5 seconds. Values greater than 30 seconds will trigger XDMP-CLOCKSKEW errors and could impact cluster availability.
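As a rough illustration of the two thresholds above, the following sketch classifies the skew between two hosts. The function name, the "warning" band between 0.5 and 30 seconds, and the idea of feeding it per-host clock offsets are assumptions made for the example. They are not MarkLogic behavior or part of any MarkLogic API.

```python
# Thresholds taken from the paragraph above.
ACCEPTABLE_SKEW_S = 0.5    # skew should stay below this
CLOCKSKEW_ERROR_S = 30.0   # skew above this triggers XDMP-CLOCKSKEW


def classify_skew(offset_a_s: float, offset_b_s: float) -> str:
    """Classify the skew between two hosts, given each host's clock
    offset (in seconds) from a common reference.

    The "warning" band is an assumption for this sketch: it marks skew
    that exceeds the acceptable level but is not yet large enough to
    trigger XDMP-CLOCKSKEW.
    """
    skew = abs(offset_a_s - offset_b_s)
    if skew < ACCEPTABLE_SKEW_S:
        return "ok"
    if skew > CLOCKSKEW_ERROR_S:
        return "error"
    return "warning"
```

For example, hosts that are 0.1 seconds ahead and 0.2 seconds behind the reference have a skew of about 0.3 seconds, which falls inside the acceptable range.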

Tools

Network Time Protocol (NTP) is the recommended solution for maintaining system clock synchronization.  NTP services can be provided by public (internet) servers, private servers, network devices, peer servers and more.

NTP Basics

NTP uses a daemon process (ntpd) that runs on the host.  The daemon periodically wakes up, polls the configured NTP servers to get the current time, and then adjusts the local system clock as necessary.  Time can be adjusted in two ways: by immediately changing to the correct time (stepping), or by gradually speeding up or slowing down the system clock until it reaches the correct time (slewing). The frequency at which ntpd wakes up, called the polling interval, can be adjusted anywhere between 1 and 17 minutes based on the level of accuracy needed.  NTP uses a hierarchy of server layers called strata.  Each stratum synchronizes with the layer above it and provides synchronization to the layer below it.
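To give a feel for the difference between changing the time immediately and adjusting the clock gradually (often called slewing), the sketch below estimates how long a gradual correction takes. The helper name is hypothetical, and the default rate of 500 parts per million is an assumption based on the maximum slew rate commonly documented for ntpd. Other daemons and configurations may use different rates.

```python
def slew_duration_s(offset_s: float, slew_rate_ppm: float = 500.0) -> float:
    """Estimate the seconds needed to remove a clock offset by slewing.

    offset_s:       clock error to correct, in seconds (sign is ignored)
    slew_rate_ppm:  how fast the clock is sped up or slowed down, in
                    parts per million (500 ppm is an assumed default)
    """
    if slew_rate_ppm <= 0:
        raise ValueError("slew_rate_ppm must be positive")
    # At 500 ppm the clock gains or loses 500 microseconds per second.
    return abs(offset_s) / (slew_rate_ppm * 1e-6)
```

Under that assumption, removing a 0.5 second offset by slewing alone takes roughly 1000 seconds (about 17 minutes), which is why a large offset is usually corrected by changing the time immediately instead.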

Public NTP Reference Servers

There are many public NTP reference servers available for time synchronization.  It's important to note that the most common public NTP reference server addresses are for a pool of servers, so hosts synchronizing against them may end up using different physical servers.  Additionally, the polling frequency recommended for cluster synchronization is usually higher than the default, and excessive polling could result in the reference server throttling or blocking traffic from your systems.

Stand Alone Cluster

For a cluster that is not replicated or connected to another cluster in some way, the primary concern is that all the hosts in the cluster be in sync with each other, rather than being accurate to UTC.

Primary/Replica Clusters

Clusters that act as either Primary or Replicas need to be synchronized with each other for replication to work correctly.  This usually means that the hosts in both clusters should reference the same NTP servers.

NTP Configuration

Time Synchronization Configuration Files

It is common to have multiple servers referenced in the chronyd configuration file, /etc/chrony.conf, or the ntpd configuration file, /etc/ntp.conf. NTP may not choose the server based on the order in the file.  Because of this, hosts could synchronize with different reference servers, introducing differences in the system clocks between the hosts in the cluster. Most organizations already have devices in their infrastructure that can act as NTP servers, as many network devices are capable of doing so, as are Windows Primary Domain Controllers.  These devices can use default polling intervals, which avoids excessive polling against public servers.

Once you have identified your NTP server, you can configure the NTP daemon on the cluster hosts. We suggest using a single reference server for all the cluster hosts, then adding all the hosts in the cluster as peers of the current node.  We also suggest adding an entry for the local host as its own server and assigning it a high stratum number (10 in the samples below), which gives it low priority. Using peers allows the cluster hosts to negotiate and elect a host to act as the reference server, providing redundancy in case the reference server is unavailable.
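One way to confirm that every cluster host points at the same reference server is to compare the server lines of each host's configuration file. The following is a minimal sketch that assumes the file contents have already been collected into strings. The function names are hypothetical, and the parsing only covers plain "server <address> [options]" lines with "#" comments.

```python
from typing import Dict, Set


def reference_servers(conf_text: str) -> Set[str]:
    """Return the addresses named on 'server' lines in a config file."""
    servers = set()
    for raw in conf_text.splitlines():
        # Drop comments and surrounding whitespace before parsing.
        fields = raw.split("#", 1)[0].split()
        if len(fields) >= 2 and fields[0] == "server":
            servers.add(fields[1])
    return servers


def all_hosts_agree(confs: Dict[str, str]) -> bool:
    """True if every host's config names the same set of servers.

    confs maps a host name to the text of its configuration file.
    """
    distinct = {frozenset(reference_servers(text)) for text in confs.values()}
    return len(distinct) <= 1
```

Peer lines are deliberately ignored, since each host is expected to list a different set of peers.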

Common Configuration Options

The burst option sends a burst of 8 packets when polling to increase the average quality of time offset statistics.  Using it against a public NTP server is considered abuse.

The iburst option sends a burst of 8 packets at initial synchronization, which is designed to speed up synchronization at startup.  Using it against a public NTP server is considered aggressive.

The minpoll and maxpoll settings are expressed as powers of two, in seconds, so a setting of 4 is 2^4 = 16 seconds. Setting both minpoll and maxpoll to 4 will cause the host to check the time approximately every 16 seconds.
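The power-of-two conversion for minpoll and maxpoll can be sketched as follows. The function name is hypothetical and exists only for this illustration.

```python
def poll_interval_s(exponent: int) -> float:
    """Convert a minpoll/maxpoll value to a polling interval in seconds.

    The setting is an exponent of two, so the interval is 2**exponent.
    """
    return 2.0 ** exponent
```

With this conversion, a setting of 4 gives 16 seconds, 6 gives 64 seconds (about 1 minute), and 10 gives 1024 seconds (about 17 minutes).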

Time Synchronization with chronyd

The following is a sample chrony.conf file:

# Primary NTP Source

server *.*.*.200 burst iburst minpoll 4 maxpoll 4

# Allow peering as a backup to the primary time servers

peer mlHost01 burst iburst minpoll 4 maxpoll 4
peer mlHost02 burst iburst minpoll 4 maxpoll 4
peer mlHost03 burst iburst minpoll 4 maxpoll 4

# Serve time even if not synchronized to a time source (for peering)
local stratum 10

# Allow other hosts on subnet to get time from this host (for peering)
# Can also be specified by individual IP
# https://chrony.tuxfamily.org/manual.html#allow-directive
allow *.*.*.0

# By default chrony will not step the clock after the initial few time checks.
# Changing the makestep option allows the clock to be stepped if its offset is larger than 0.5 seconds.
makestep 0.5 -1

The other settings (driftfile, rtcsync, log) can be left as is. The new settings will take effect after the chronyd service is restarted.
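When the same settings must be rolled out to several hosts, the sample above can be generated rather than edited by hand. The following is a minimal sketch, not a supported tool. The function name, the example server name and subnet, and the choice to leave the local host out of its own peer list are all assumptions. Only directives that appear in the sample are emitted.

```python
from typing import List


def render_chrony_conf(
    server: str,
    hosts: List[str],
    local_host: str,
    subnet: str,
    options: str = "burst iburst minpoll 4 maxpoll 4",
) -> str:
    """Build chrony.conf text modeled on the sample above."""
    lines = ["# Primary NTP Source", f"server {server} {options}", ""]
    lines.append("# Allow peering as a backup to the primary time servers")
    # Assumption: a host does not need to list itself as a peer.
    lines += [f"peer {h} {options}" for h in hosts if h != local_host]
    lines += [
        "",
        "# Serve time even if not synchronized to a time source (for peering)",
        "local stratum 10",
        "",
        "# Allow other hosts on subnet to get time from this host (for peering)",
        f"allow {subnet}",
        "",
        "# Allow the clock to be stepped if its offset is larger than 0.5 seconds",
        "makestep 0.5 -1",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical values, for illustration only.
    print(render_chrony_conf(
        "ntp.example.internal",
        ["mlHost01", "mlHost02", "mlHost03"],
        "mlHost01",
        "10.10.0.0/24",
    ))
```

The generated text should still be reviewed and tested on a non-production host before it replaces an existing /etc/chrony.conf.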

Time Synchronization with ntpd

The following is a sample ntp.conf file:

#The current host has an ip of 10.10.0.1
server ntpserver burst iburst minpoll 4 maxpoll 4
 
#All of the cluster hosts are peered with each other.
peer mlHost01 burst iburst minpoll 4 maxpoll 4
peer mlHost02 burst iburst minpoll 4 maxpoll 4
peer mlHost03 burst iburst minpoll 4 maxpoll 4
 
#Add the local host so the peered servers can negotiate
# and choose a host to act as the reference server
server 10.10.0.1
fudge 10.10.0.1 stratum 10

The fudge setting is used to alter the stratum of the server from the default of 0.

Choosing Between NTP Daemons

Red Hat states that chrony is the preferred NTP daemon, and should be used when possible.

Chrony should be preferred for all systems except for the systems that are managed or monitored by tools that do not support chrony, or the systems that have a hardware reference clock which cannot be used with chrony.

As always, system configuration changes should be tested and validated prior to putting them into production use.
