to an instantiation of the protocol on a local processor, while the term
peer refers to the instantiation of the protocol on a remote processor
connected by a network path.
Figure 1 shows an implementation model for a host including
three processes sharing a partitioned data base, with a partition
dedicated to each peer, and interconnected by a message-passing system.
The transmit process, driven by independent timers for each peer,
collects information in the data base and sends NTP messages to the
peers. Each message contains the local timestamp when the message is
sent, together with previously received timestamps and other information
necessary to determine the hierarchy and manage the association. The
message transmission rate is determined by the accuracy required of the
local clock, as well as the accuracies of its peers.
The receive process receives NTP messages and perhaps messages in other
protocols, as well as information from directly connected radio clocks.
When an NTP message is received, the offset between the peer clock and
the local clock is computed and incorporated into the data base along
with other information useful for error determination and peer
selection. A filtering algorithm described in Section 4 improves the
accuracy by discarding inferior data.
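
As an illustration of the offset computation mentioned above, the
following is a minimal sketch in Python of the customary four-timestamp
calculation. The function name and the t1..t4 parameter names are
assumptions made for this example and are not the notation of this
document; the sketch also assumes the outbound and return network
delays are roughly equal.

```python
def offset_and_delay(t1, t2, t3, t4):
    """Estimate peer clock offset and roundtrip delay, in seconds.

    Hypothetical parameter names (not from the specification):
      t1: local clock when the request was sent
      t2: peer clock when the request was received
      t3: peer clock when the reply was sent
      t4: local clock when the reply was received
    """
    # Offset of the peer clock relative to the local clock, assuming the
    # outbound and return path delays are symmetric.
    offset = ((t2 - t1) + (t3 - t4)) / 2.0
    # Roundtrip delay, excluding the time the peer held the message.
    delay = (t4 - t1) - (t3 - t2)
    return offset, delay
```

A positive offset in this sketch means the peer clock is ahead of the
local clock.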
The update procedure is initiated upon receipt of a message and at other
times. It processes the offset data from each peer and selects the best
one using the algorithms of Section 4. This may involve many
observations of a few peers or a few observations of many peers,
depending on the accuracies required.
The local-clock process operates upon the offset data produced by the
update procedure and adjusts the phase and frequency of the local clock
using the mechanisms described in Section 5. This may result in either a
step-change or a gradual phase adjustment of the local clock to reduce
the offset to zero. The local clock provides a stable source of time
information to other users of the system and for subsequent reference by
NTP itself.
Network Configurations
The synchronization subnet is a connected network of primary and
secondary time servers, clients and interconnecting transmission paths.
A primary time server is directly synchronized to a primary reference
source, usually a radio clock. A secondary time server derives
synchronization, possibly via other secondary servers, from a primary
server over network paths possibly shared with other services. Under
normal circumstances the synchronization subnet of primary and
secondary servers is intended to assume a hierarchical-master-slave
configuration with the primary servers at the root and secondary servers
of decreasing accuracy at successive levels toward the leaves.
Following conventions established by the telephone industry [BEL86], the
accuracy of each server is defined by a number called the stratum, with
the topmost level (primary servers) assigned as one and each level
downwards (secondary servers) in the hierarchy assigned as one greater
than the preceding level. With current technology and available radio
clocks, single-sample accuracies on the order of a millisecond can be
achieved at the network interface of a primary server. Accuracies of
this order require special care in the design and implementation of the
operating system and the local-clock mechanism, such as described in
Section 5.
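
The stratum numbering rule can be sketched as follows. This is only an
illustration in Python: the function assign_strata and the upstream
mapping are hypothetical names introduced here, and the sketch assumes
each server has exactly one current synchronization source.

```python
def assign_strata(upstream):
    """Return a dict mapping each server name to its stratum.

    upstream (a hypothetical structure) maps a server name to the name
    of its synchronization source, or to None for a primary server.
    """
    strata = {}

    def stratum_of(name, seen=()):
        if name in strata:
            return strata[name]
        if name in seen:
            # A server must never synchronize, even indirectly, to itself.
            raise ValueError("synchronization loop at " + name)
        source = upstream[name]
        if source is None:
            value = 1  # primary servers are assigned stratum one
        else:
            # each level downwards is one greater than the preceding level
            value = stratum_of(source, seen + (name,)) + 1
        strata[name] = value
        return value

    for name in upstream:
        stratum_of(name)
    return strata
```

For example, a primary server "a" feeding "b", which in turn feeds "c",
yields strata of one, two and three respectively.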
As the stratum increases from one, the single-sample accuracies
achievable will degrade depending on the network paths and local-clock
stabilities. In order to avoid the tedious calculations [BRA80]
necessary to estimate errors in each specific configuration, it is
useful to assume the mean measurement errors accumulate approximately in
proportion to the measured delay and dispersion relative to the root of
the synchronization subnet. Appendix H contains an analysis of errors,
including a derivation of maximum error as a function of delay and
dispersion, where the latter quantity depends on the precision of the
timekeeping system, frequency tolerance of the local clock and various
residuals. Assuming the primary servers are synchronized to standard
time within known accuracies, this provides a reliable, deterministic
specification on timekeeping accuracies throughout the synchronization
subnet.
Again drawing from the experience of the telephone industry, which
learned such lessons at considerable cost [ABA89], the synchronization
subnet topology should be organized to produce the highest accuracy, but
must never be allowed to form a loop. An additional factor is that each
increment in stratum involves a potentially unreliable time server which
introduces additional measurement errors. The selection algorithm used
in NTP uses a variant of the Bellman-Ford distributed routing algorithm
[37] to compute the minimum-weight spanning trees rooted on the primary
servers. The distance metric used by the algorithm consists of the
(scaled) stratum plus the synchronization distance, which itself
consists of the dispersion plus one-half the absolute delay. Thus, the
synchronization path will always take the minimum number of servers to
the root, with ties resolved on the basis of maximum error.
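
The distance metric described above can be sketched in Python as shown
below. This is an illustration under stated assumptions, not the
selection algorithm itself: the Peer record, the function names and the
value of STRATUM_SCALE are hypothetical. The scale factor is assumed
only to be larger than any synchronization distance that can occur, so
that a lower stratum always wins and the synchronization distance
merely breaks ties.

```python
from dataclasses import dataclass

# Assumed scale factor for this example; it must exceed the largest
# possible synchronization distance so that stratum dominates.
STRATUM_SCALE = 16.0


@dataclass
class Peer:
    name: str
    stratum: int
    delay: float       # roundtrip delay to the root, in seconds
    dispersion: float  # dispersion to the root, in seconds


def sync_distance(peer):
    # Dispersion plus one-half the absolute delay.
    return peer.dispersion + abs(peer.delay) / 2.0


def distance_metric(peer, scale=STRATUM_SCALE):
    # (Scaled) stratum plus the synchronization distance.
    return peer.stratum * scale + sync_distance(peer)


def select_peer(peers):
    # Choose the peer with the smallest distance metric.
    return min(peers, key=distance_metric)
```

With this metric a stratum-1 peer is preferred over any stratum-2 peer,
and between two stratum-1 peers the one with the smaller
synchronization distance is chosen.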
As a result of this design, the subnet reconfigures automatically in a
hierarchical-master-slave configuration to produce the most accurate and
reliable time, even when one or more primary or secondary servers or the
network paths between them fail. This includes the case where all normal
primary servers (e.g., a highly accurate WWVB radio clock operating at the
lowest synchronization distances) on a possibly partitioned subnet fail,
but one or more backup primary servers (e.g., a less accurate WWV radio
clock operating at higher synchronization distances) continue operation.
However, should all primary servers throughout the subnet fail, the
remaining secondary servers will synchronize among themselves while