user/ntpclient/HOWTO

The goal of ntpclient is not only to set your computer's clock
right once, but to keep it there.

First, a note on typical 1990s and 2000s computer crystals.  They
are truly pathetic.  A "real" crystal oscillator (TCXO) usually has
an initial set error of less than 5 ppm, and variation over time, voltage,
and temperature measured in tenths of a ppm (and an OCXO can reach ±0.3 ppm
stability over ten years and 85°C temperature swing).  The devices used
in conventional PC motherboards and single board computers, however,
often have initial set errors up to 150 ppm, and will vary 5 ppm over
the course of a day-night cycle in a pseudo-air-conditioned space.

[Operating system software can sometimes exacerbate the problem.  I
have seen some i686 Red Hat 7.3 systems run the clock at 512 Hz, or 1953
microseconds per tick, giving a built-in 64 ppm error.  Even the normally
exemplary DEC Alpha has, when run with Linux, a truly awful calibration
scheme; Linux runs it with a nominal ticks per second of 1024, which
gives a tick value of 977, theoretical additional error -448 ppm, actual
frequency observed -443.7 ppm.]

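The 64 ppm and 448 ppm figures follow from forcing the tick to be a whole
number of microseconds.  A minimal shell sketch of that arithmetic, assuming
the tick is rounded to the nearest microsecond (the function name
tick_error_ppm is made up for illustration, and only the magnitude is
printed, since sign conventions differ):

```shell
# Rounding error, in ppm, caused by expressing one timer period as a
# whole number of microseconds.  Assumes round-to-nearest, which
# reproduces the 1953 and 977 tick values quoted in the text.
tick_error_ppm() {
    awk -v hz="$1" 'BEGIN {
        tick = int(1000000 / hz + 0.5)      # microseconds per tick
        err = tick * hz - 1000000           # microseconds gained per second
        if (err < 0) err = -err             # report the magnitude only
        printf "%d\n", err
    }'
}

tick_error_ppm 512     # prints 64
tick_error_ppm 1024    # prints 448
```

A rate of 100 Hz or 1000 Hz divides one million evenly, so the same function
prints 0 for those.
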
Still, the pattern is clear: the first and largest error of a crystal
is its initial set error.  I strongly urge the calibration of each computer,
and storing its frequency error in a non-volatile medium, before you
do anything else with time setting and locking.  While you could do it
in a few seconds using an accurate frequency counter, below I show a
software-only method using ntpclient and a high quality NTP server.

To perform the activities described, you need a way to control and monitor
your system's clock -- both its frequency and value.  On Linux, the
kernel API is described in adjtimex(2).  There are two programs that
I know of that provide shell-level access to this interface, both called
adjtimex(1).

One is written by Steven Dick and Jim Van Zandt, see the adjtimex* files in
http://metalab.unc.edu/pub/Linux/system/admin/time/
It uses long options, and includes some interesting functionality beyond
the basic exposure of adjtimex(2).

I (Larry Doolittle) wrote the other; it uses short options, and has no
bloat^H^H^H^H^Hextra features.  I include the code here for a standalone
version; it is also incorporated into busybox (http://www.busybox.net),
although you may have to select it at compile time, like any other component.

Fortunately (and not surprisingly) the core functions of the two adjtimex
programs can be used interchangeably, as long as you only use the short option
variant of the Dick/Van Zandt adjtimex.  The options discussed here are:
       -f    frequency (integer kernel units)
       -o    time offset in microseconds
       -t    kernel tick (microseconds per jiffy)

First, set the time approximately right, as root:
   ntpclient -s -h $NTPHOST
You should see a single line printed like
36765 4980.373    1341.0     39.7  956761.4    839.2  0
Get used to this line; the columns are:
 1. day since 1900
 2. seconds since midnight
 3. elapsed time for NTP transaction (microseconds)
 4. internal server delay (microseconds)
 5. clock difference between your computer and the NTP server (microseconds)
 6. dispersion reported by server (microseconds)
 7. your computer's adjtimex frequency (ppm * 65536)
So in the example above, your computer's clock was a bit more than
0.95 seconds fast, compared to the clock on $NTPHOST.
Now check that the clock setting worked.
   ntpclient -c 1 -h $NTPHOST
36765 4993.512    1345.0     40.9    3615.3    839.2  0
So now the time difference is only a few milliseconds.

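To make the seven columns easier to read at a glance, the unit conversions
can be done with a little awk.  This is only a sketch; the function name
describe_ntp_line and the wording of its output are made up for illustration
and are not part of ntpclient:

```shell
# Summarise one ntpclient output line read from standard input.
# Column 5 is the clock difference and column 3 the round-trip time, both
# in microseconds; column 7 is the adjtimex frequency in ppm * 65536.
describe_ntp_line() {
    awk '{
        printf "offset %.1f ms, round trip %.1f ms, frequency %.4f ppm\n",
               $5 / 1000, $3 / 1000, $7 / 65536
    }'
}

echo "36765 4980.373    1341.0     39.7  956761.4    839.2  0" | describe_ntp_line
# prints: offset 956.8 ms, round trip 1.3 ms, frequency 0.0000 ppm
```
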
Next, measure the frequency calibration for your system.
If you're in a hurry, it's OK to only spend 20 minutes on this step.
    ntpclient -i 60 -c 20 -h $NTPHOST >$(hostname).ntp.log &

Otherwise, you will learn much more about your system and its communication
with the NTP server by letting the log run for 24 hours.
    ntpclient -i 300 -c 288 -h $NTPHOST >$(hostname).ntp.log &

Things to watch for in the above log:

If the last column (kernel frequency fine tune) ever changes, you haven't
turned off other time adjustment programs.  AFAIK the only programs around
that would move this number are ntpclient and xntpd.  On most out-of-the-box
systems, that last column should start at zero and stay at zero.

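Checking the last column by eye gets tedious on a 288-line log.  A small
sketch of an automatic check follows; freq_is_steady is a hypothetical
helper name, not something shipped with ntpclient:

```shell
# Succeeds (exit status 0) if column 7 has the same value on every line
# of the log file named by the first argument, and fails otherwise.
freq_is_steady() {
    awk 'NR == 1     { first = $7 }
         $7 != first { changed = 1 }
         END         { exit (changed ? 1 : 0) }' "$1"
}

# Example use, once a log exists:
#   freq_is_steady "$(hostname).ntp.log" || echo "frequency was adjusted"
```
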
Use gnuplot to plot the resulting file as follows:
   plot "HOSTNAME.ntp.log" using (($1-36765)*86400+$2):5:($3+$6) with yerrorbars
(36765 is the day number from the example above; substitute the day number
on the first line of your own log.)
This shows time error (microseconds) as a function of elapsed time (seconds).
The error bars show the uncertainty in the measurement.  Ideally, it would
be a smooth, straight line, where the slope represents the frequency error
of your crystal.

If an occasional point is both off-center and has a large error bar, it shows
a transaction got delayed somewhere in the process, either inside the server,
or in one of the two UDP packet propagation steps.  This is normal, and
ntpclient can deal with those quite well.  If points are not evenly spaced on
the horizontal axis, packets were actually lost; this is less common, but
still OK.

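Lost packets can also be found without a plot, by looking for gaps between
consecutive timestamps.  The sketch below assumes a gap counts as "lost
packets" when it exceeds 1.5 times the polling interval; that threshold and
the name find_gaps are arbitrary choices made for illustration:

```shell
# Report every gap noticeably longer than the polling interval.
# First argument: the -i value used for the log, in seconds.
# The log itself is read from standard input.
find_gaps() {
    awk -v interval="$1" '
        { t = $1 * 86400 + $2 }                 # seconds since day 0
        NR > 1 && t - prev > 1.5 * interval {
            printf "gap of %.0f s before day %s, second %s\n", t - prev, $1, $2
        }
        { prev = t }'
}

# Example use:
#   find_gaps 300 <"$(hostname).ntp.log"
```
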
If the error bar suddenly becomes large, and takes a few minutes to slowly
recover, your NTP host (presumably xntpd) had problems communicating with
_its_ server, and reported that problem to you by increasing its "dispersion"
(this is a hack, required by xntpd's core incorrect assumption that errors
in network delays have Gaussian statistics; ntpclient does not have this flaw).

If there are sudden large, persistent steps in error, some other program is
making step changes to time.  Check for, e.g., ntpdate run as a cron job.
If your client machine is OK, check for problems on the _host_ machine.

Assuming the graph above is clean, and has non-garbled data for the first
and last points, you can run the log through the enclosed awk script
(rate.awk) to determine the appropriate frequency value.
$ awk -f rate.awk <test.dat
delta-t 119400 seconds
delta-o -142308 useconds
slope -1.19186 ppm
old frequency -1240000 ( -18.9209 ppm)
new frequency -1318109 ( -20.1127 ppm)
$

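The numbers in that output are consistent with simple arithmetic: the slope
is delta-o divided by delta-t, and the new frequency is the old one plus the
slope times 65536.  The following is not the enclosed rate.awk, only a sketch
that reproduces the same figures under the assumption that just the first and
last log lines are used; estimate_freq is a made-up name:

```shell
# Estimate a new adjtimex frequency from the first and last lines of an
# ntpclient log (file name given as the first argument).
estimate_freq() {
    awk 'NR == 1 { t0 = $1 * 86400 + $2; o0 = $5; f0 = $7 }
                 { t1 = $1 * 86400 + $2; o1 = $5 }
         END {
             if (NR < 2 || t1 == t0) exit 1
             slope = (o1 - o0) / (t1 - t0)    # microseconds per second = ppm
             printf "slope %g ppm\n", slope
             printf "new frequency %d\n", f0 + slope * 65536
         }' "$1"
}
```

With a first line at offset 0 and a last line 119400 seconds later at offset
-142308, both showing frequency -1240000, this prints a slope of -1.19186 ppm
and a new frequency of -1318109, matching the output shown.
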
For now, you should plug in the new frequency value
   adjtimex -f -1318109
Then reset the clock
   ntpclient -s -h $NTPHOST
and ponder how it makes sense in _your_ (possibly embedded) environment
to have the number -1318109 applied via adjtimex every time your machine
boots.  If the frequency offset (absolute value) is greater than about
140 ppm (9175040), you have a problem: you may be able to fix it with
the -t option to adjtimex, or you may need to hack phaselock.c, which has
a maximum adjustment extent of +/- 150 ppm built in (change the
#define MAX_CORRECT and rebuild ntpclient).  I'd like to suggest that
you replace the defective crystal instead, but I understand that is rarely
practical.

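Since the -f value is in units of ppm * 65536, converting between the two
and checking against a limit is a one-liner each.  A sketch, with
hypothetical helper names ppm_to_freq and freq_within:

```shell
# Convert ppm to the integer units taken by adjtimex -f (ppm * 65536).
ppm_to_freq() {
    awk -v ppm="$1" 'BEGIN { printf "%.0f\n", ppm * 65536 }'
}

# Succeeds if the kernel frequency value FREQ lies within +/- LIMIT_PPM.
freq_within() {    # usage: freq_within FREQ LIMIT_PPM
    awk -v f="$1" -v lim="$2" 'BEGIN {
        if (f < 0) f = -f
        exit ((f <= lim * 65536) ? 0 : 1)
    }'
}

ppm_to_freq 140    # prints 9175040
```

For example, "freq_within -1318109 140" succeeds, so the value computed above
is comfortably inside the range discussed in the text.
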
On to ntpclient -l.  This is actually easy, if you performed and understood
the previous steps.  Run
  ntpclient -l -h $NTPHOST
in the background.  It will make small (probably less than 3 ppm) adjustments
to the system frequency to keep the clocks locked.  Typical performance over
Ethernet (even through a few routers) is a worst case error of +/- 10 ms.

I won't try to tell you _where_ to put the boot time commands.  They should
boil down to:
   adjtimex -f $NONVOLATILE_MEMORY_VALUE
   ntpclient -s -i 1 -g 10000 -h $NTPHOST
   ntpclient -l -h $NTPHOST >some_log_file
The second line makes explicit the retries that may be required for this
UDP-based time protocol.  If the first time request takes longer than 10000
microseconds to resolve, or the packets get lost, it instructs ntpclient to
try again one second later, and it won't exit until it gets a suitable
response.

It's an interesting question how sensitive the boot process should be
to the time set process.  If you have a battery backed hardware clock,
there's not much problem running for a while without a network-accurate
system clock.  In that case you could put both ntpclient commands into a
background script, and the only possible issue is the sudden (but probably
small) warp of the clock at some indefinite time in the boot sequence when
ntpclient gets its acceptable answer.  On the other hand, some embedded
computers have no clue what time it is until the network responds.  Any
files created will be marked Jan 1 1970, and other application-dependent
issues may arise if there is a nonsense time on the system during later
parts of the boot sequence.  Then you may well want to enforce completion
of the first ntpclient before starting your application.  If this is too
drastic for you, and you want a fallback mode when the time server is dead,
add a "-c 5" switch to the end of that ntpclient command, giving at most 5
tries, and therefore about 5 seconds of delay, if something goes wrong with
the time set.
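
The boot time commands, including the "-c 5" fallback, could be wrapped in a
shell function along the following lines.  This is a sketch only: the
function name start_timekeeping and the FREQ_FILE and NTP_LOG paths are
hypothetical and should be adapted to your system, and it assumes the stored
calibration is a single integer in a readable file.

```shell
# Hypothetical default locations; override them in the environment.
FREQ_FILE=${FREQ_FILE:-/etc/ntpclient.freq}
NTP_LOG=${NTP_LOG:-/var/log/ntpclient.log}

start_timekeeping() {    # usage: start_timekeeping NTPHOST
    # Restore the calibrated frequency, if one has been stored.
    if [ -r "$FREQ_FILE" ]; then
        adjtimex -f "$(cat "$FREQ_FILE")"
    fi
    # Set the clock, retrying once a second, but give up after 5 tries
    # so that a dead time server cannot stall the boot indefinitely.
    ntpclient -s -i 1 -g 10000 -h "$1" -c 5
    # Keep the clock locked from here on, in the background.
    ntpclient -l -h "$1" >>"$NTP_LOG" &
}
```

Calling "start_timekeeping $NTPHOST" before the application starts gives the
enforced-completion behavior described above; running the call itself in the
background gives the more relaxed variant.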