My PC time server suddenly wants to take unscheduled joy rides... (weird jitter/offset issues)

Dear group:

I have devoted a PC to be a time server for my little home network.  It didn't need to be really accurate; I maintain it mostly just for fun.

Anyway, this server, while not particularly accurate (about 35 ppm), has nevertheless been extremely reliable, and the network reliable enough to give me single-digit jitter with very consistent offsets from the five internet sources I sync to.

Suddenly I've been experiencing weird jaunts where the offsets jump by 100 ms or more and stay there for a bit, before finally drifting to levels they were at when it was healthy.  Then after a few minutes, or a few hours - very irregular intervals - it starts all over again.  It is always heralded by jitter suddenly skyrocketing, even though the offsets are rock-steady. Then within the next round or so of queries, the offsets suddenly jump - all of them!
What do you think might be the cause?

Any comments, ideas and/or advice are welcome.  Thanks for taking the time to reply!

Bill
Bill
7/20/2016 5:36:50 AM
comp.protocols.time.ntp

I forgot to say I am running Meinberg's NTP version 4.2.8 p8 and that great little monitor of theirs.

Thanks, Meinberg!
Bill
7/20/2016 5:40:04 AM
Bill,

Bill Ko schrieb:
> I forgot to say I am running Meinberg's NTP version 4.2.8 p8 and that great little monitor of theirs.

Please keep in mind that the NTP software for Windows has not been
written by Meinberg. It's just the standard NTP package maintained by
the NTP project at
http://www.ntp.org

We at Meinberg support the NTP project. We also pick up the source code
release, build the binaries for Windows, and put them into a GUI setup
program to simplify installation under Windows:
http://www.meinbergglobal.com/english/sw/ntp.htm#ntp_stable

Martin
-- 
Martin Burnicki

Meinberg Funkuhren
Bad Pyrmont
Germany
Martin
7/20/2016 7:12:18 AM
Bill,

Bill Ko wrote:
> Dear group:
> 
> I have devoted a PC to be a time server for my little home network.  It didn't need to be really accurate; I maintain it mostly just for fun.
> 
> Anyway, this server, while not particularly accurate (about 35 ppm), ...

Please note the mean frequency offset (35 ppm in your case) tells how
much the quartz oscillator on your mainboard is off its nominal
frequency. This causes the undisciplined clock to drift.

The value reported by ntpd is the frequency offset ntpd has determined,
and which it tries to compensate to minimize the system clock drift. So
the absolute value doesn't really matter as long as ntpd can accurately
determine and compensate it.

Please note the quartz frequency changes more or less with the ambient
temperature, and the frequency offset reported by ntpd changes similarly
to compensate the variations.
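
To put the 35 ppm figure in perspective, the arithmetic can be sketched in a few lines of Python. This is only an illustration: the function name is made up, and it assumes a constant frequency error with no correction applied by ntpd.

```python
def drift_seconds(freq_error_ppm: float, elapsed_seconds: float) -> float:
    """Drift accumulated by an uncorrected clock with a constant frequency error."""
    return freq_error_ppm * 1e-6 * elapsed_seconds


SECONDS_PER_DAY = 86_400

# 35 ppm is the value quoted in the original post.
per_day = drift_seconds(35.0, SECONDS_PER_DAY)
print(f"{per_day:.3f} s/day")  # roughly 3 seconds per day if left undisciplined
```

As long as ntpd can measure and compensate this value, the clock does not actually drift by that amount.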

> ... has nevertheless been extremely reliable, and the network reliable enough to give me single-digit jitter with very consistent offsets from the five internet sources I sync to.
> 
> Suddenly I've been experiencing weird jaunts where the offsets jump by 100 ms or more and stay there for a bit, before finally drifting to levels they were at when it was healthy.  Then after a few minutes, or a few hours - very irregular intervals - it starts all over again.  It is always heralded by jitter suddenly skyrocketing, even though the offsets are rock-steady. Then within the next round or so of queries, the offsets suddenly jump - all of them!
> What do you think might be the cause?

I don't know.

Perhaps there are some hints in the log entries ntpd writes to the
Windows application event log.

You could also try to figure out what happens if you periodically run the
commands

ntpq -c "rv &1"
ntpq -c "rv &2"
....

and look at the displayed filter values. In the command above, '&1',
'&2' refer to the 1st and 2nd 'server' entries in your ntp.conf file,
and so on.
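
For anyone who wants to automate this, here is a rough Python sketch that pulls the filter registers out of the `rv` output. The sample text is invented for illustration (addresses and values are not from a real server), the function names are hypothetical, and the regular expression assumes the usual `filtdelay=`, `filtoffset=` and `filtdisp=` layout of space-separated numbers.

```python
import re
import subprocess

# Matches e.g. "filtoffset=   -1.20   -1.31 ..." up to the next comma.
FILTER_RE = re.compile(r"(filtdelay|filtoffset|filtdisp)=\s*([-+\d.\s]+)")


def parse_filter_values(rv_text):
    """Return {'filtdelay': [...], 'filtoffset': [...], 'filtdisp': [...]} in ms."""
    return {
        name: [float(value) for value in values.split()]
        for name, values in FILTER_RE.findall(rv_text)
    }


def query_association(index):
    """Run `ntpq -c "rv &<index>"` (ntpq must be on the PATH) and parse the result."""
    result = subprocess.run(
        ["ntpq", "-c", f"rv &{index}"],
        capture_output=True, text=True, check=True,
    )
    return parse_filter_values(result.stdout)


# Invented sample output, used here so the parser can be tried without ntpq.
SAMPLE_RV = """\
srcadr=192.0.2.1, srcport=123, offset=-1.204, delay=30.512, jitter=0.981,
filtdelay=    30.51   30.62   30.48   31.02   30.55   30.70   30.49   30.60,
filtoffset=   -1.20   -1.31   -1.15   -0.98   -1.22   -1.27   -1.18   -1.25,
filtdisp=      0.00    0.98    1.97    2.94    3.92    4.89    5.87    6.84
"""

parsed = parse_filter_values(SAMPLE_RV)
print(parsed["filtoffset"])
```

Calling `query_association(1)`, `query_association(2)` and so on from a scheduled task or a simple loop, and logging the results with a timestamp, would show whether the filter offsets of all servers move together.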

Alternatively you can let ntpd generate some statistics files by adding
the following lines to ntp.conf:

----------------------------------------------------
statsdir "C:\Program Files\NTP\etc\"
filegen peerstats  file peerstats  type week enable
filegen loopstats  file loopstats  type week enable
filegen clockstats file clockstats type week enable
filegen rawstats   file rawstats   type week enable
----------------------------------------------------

where the specified 'statsdir' should point to the etc\ subdirectory
below the real installation path of the NTP software.
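
Once the files exist, something like the following Python sketch could be used to spot the excursions in a loopstats file. It assumes the documented loopstats field order (MJD, seconds past midnight, offset in seconds, frequency in ppm, jitter in seconds, ...); the sample lines, the 50 ms threshold and all names are invented for illustration.

```python
from typing import NamedTuple


class LoopSample(NamedTuple):
    mjd: int          # modified Julian day
    seconds: float    # seconds past midnight UTC
    offset_s: float   # clock offset in seconds
    freq_ppm: float   # frequency offset in ppm
    jitter_s: float   # RMS jitter in seconds


def parse_loopstats(lines):
    """Parse loopstats records, skipping blank or truncated lines."""
    samples = []
    for line in lines:
        fields = line.split()
        if len(fields) < 5:
            continue
        samples.append(LoopSample(int(fields[0]), float(fields[1]),
                                  float(fields[2]), float(fields[3]),
                                  float(fields[4])))
    return samples


def find_excursions(samples, threshold_s=0.050):
    """Return the samples whose absolute offset reaches the threshold."""
    return [s for s in samples if abs(s.offset_s) >= threshold_s]


# Invented sample records, not real measurements.
SAMPLE_LINES = [
    "57589 20010.123 0.000412000 35.102 0.000950000 0.004200 6",
    "57589 20074.130 0.000398000 35.101 0.000910000 0.004100 6",
    "57589 20138.127 0.104500000 35.140 0.036900000 0.015000 6",
    "57589 20202.131 0.098700000 35.310 0.034100000 0.061000 6",
]

samples = parse_loopstats(SAMPLE_LINES)
excursions = find_excursions(samples)
for s in excursions:
    print(s.mjd, s.seconds, s.offset_s)
```

In real use the lines would come from the loopstats file in the configured statsdir instead of `SAMPLE_LINES`.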

If you have several NTP sources/servers configured and all of them show
the same time step at the same time, then something is happening with
your system time. Perhaps another program changes the time?

Martin
-- 
Martin Burnicki

Meinberg Funkuhren
Bad Pyrmont
Germany
Martin
7/20/2016 8:02:23 AM
Bill Ko <sirstray@gmail.com> wrote:
> I forgot to say I am running Meinberg's NTP version 4.2.8 p8 and that great little monitor of theirs.
>
> Thanks, Meinberg!

You should try installing Linux on that PC and see how that works.
Or buy a Raspberry Pi, with Linux on it, and use that as your timeserver.
Might save you a lot on your electricity bill as well.
Rob
7/20/2016 8:43:11 AM
On 20/07/2016 09:02, Martin Burnicki wrote:
[]
> Alternatively you can let ntpd generate some statistics files by adding
> the following lines to ntp.conf:
>
> ----------------------------------------------------
> statsdir "C:\Program Files\NTP\etc\"
> filegen peerstats  file peerstats  type week enable
> filegen loopstats  file loopstats  type week enable
> filegen clockstats file clockstats type week enable
> filegen rawstats   file rawstats   type week enable
> ----------------------------------------------------
>
> where the specified 'statsdir' should point to the etc\ subdirectory
> below the real installation path of the NTP software.
>
> If you have several NTP sources/servers configured and all of them show
> the same time stap at the same time then something is happening with
> your system time. Eventually another program changes the time?
>
> Martin

I have found issues with having NTP in C:\Windows in Win-10 (possibly 
from Win-8 upwards).  Consequently I now recommend installing to:

   C:\Tools\NTP

rather than C:\Program Files\NTP.  This allows the user to edit the 
ntp.conf file more easily.

   http://www.satsignal.eu/ntp/setup.html

-- 
Cheers,
David
Web: http://www.satsignal.eu
David
7/20/2016 10:21:59 AM
On 20/07/2016 07:36, Bill Ko wrote:
> Dear group:
>
> I have devoted a PC to be a time server for my little home network.  It didn't need to be really accurate; I maintain it mostly just for fun.
>
> Anyway, this server, while not particularly accurate (about 35 ppm), has nevertheless been extremely reliable, and the network reliable enough to give me single-digit jitter with very consistent offsets from the five internet sources I sync to.
>
> Suddenly I've been experiencing weird jaunts where the offsets jump by 100 ms or more and stay there for a bit, before finally drifting to levels they were at when it was healthy.  Then after a few minutes, or a few hours - very irregular intervals - it starts all over again.  It is always heralded by jitter suddenly skyrocketing, even though the offsets are rock-steady. Then within the next round or so of queries, the offsets suddenly jump - all of them!
> What do you think might be the cause?
>
> Any comments, ideas and/or advice is welcome.  Thanks for taking the time to reply!
>

Where are you getting the time from?

To me this sounds like NTPD is receiving slightly different time from
two sources and is somewhat undecided which one to trust.

Look at the output from the command ntpq -p and see if any of the time
sources are too far off (offset column differs a lot from the others,
or jitter column has a much larger value than the others).
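
As a rough illustration of that check, here is a Python sketch that parses `ntpq -p` style output and flags sources that stand out. The sample output, the host names and both thresholds (20 ms) are made up; the parser assumes the usual ten-column layout ending in delay, offset and jitter.

```python
import statistics

# Invented sample in the usual `ntpq -p` layout; not real servers.
SAMPLE_OUTPUT = """\
     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
*ntp-a.example   .GPS.            1 u   33   64  377   30.512   -1.204   0.981
+ntp-b.example   192.0.2.10       2 u   12   64  377   41.870   -0.350   1.420
+ntp-c.example   192.0.2.20       2 u   50   64  377   28.300   -2.010   0.760
-ntp-d.example   192.0.2.30       3 u   41   64  377   55.100  104.700  38.900
"""


def parse_peers(text):
    """Parse the peer billboard; the first column of each line is the tally code."""
    peers = []
    for line in text.splitlines()[2:]:      # skip the header and the ==== line
        if not line.strip():
            continue
        fields = line[1:].split()
        if len(fields) < 10:
            continue
        peers.append({
            "tally": line[0],
            "remote": fields[0],
            "delay": float(fields[-3]),
            "offset": float(fields[-2]),
            "jitter": float(fields[-1]),
        })
    return peers


def outliers(peers, max_offset_dev_ms=20.0, max_jitter_ms=20.0):
    """Names of peers whose offset is far from the median or whose jitter is high."""
    if not peers:
        return []
    median_offset = statistics.median(p["offset"] for p in peers)
    return [
        p["remote"] for p in peers
        if abs(p["offset"] - median_offset) > max_offset_dev_ms
        or p["jitter"] > max_jitter_ms
    ]


peers = parse_peers(SAMPLE_OUTPUT)
print(outliers(peers))
```

If every source is flagged at the same moment, the common factor is more likely the local clock or the local network link than any single server.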


Enjoy

Jakob
-- 
Jakob Bohm, CIO, Partner, WiseMo A/S.  https://www.wisemo.com
Transformervej 29, 2860 Søborg, Denmark.  Direct +45 31 13 16 10
This public discussion message is non-binding and may contain errors.
WiseMo - Remote Service Management for PCs, Phones and Embedded
Jakob
7/20/2016 11:21:11 AM
Just want to let everyone know I appreciate their input.  I'm off to work, but you've given me some stuff to chew on, as well as some good information in general.

Sorry, Mr. Burnicki, if my post sounded sarcastic - actually it was genuine praise for the monitor program!  When you mentioned that you just build the binaries and make it a user-friendly install, I kind of suspected that.  I was just trying to give as much information as possible (even though it seems that there's always SOMEthing important that I'd forgotten to mention - I wonder what it will be this time).

Thanks, everyone!
Bill
7/20/2016 12:19:59 PM
Bill,

Bill Ko wrote:
> Just want to let everyone know I appreciate their input.  I'm off to work, but you've given me some stuff to chew on, as well as some good information in general.
> 
> Sorry, Mr. Burnicki, if my post sounded sarcastic ... 

Please just call me Martin.

No need to be sorry. What you wrote didn't sound sarcastic to me. ;-)

> ... - actually it was genuine praise for the monitor program!  When you mentioned that you just build the binaries and make it a user-friendly install, I kind of suspected that.  I was just trying to give as much information as possible (even though it seems that there's always SOMEthing important that I'd forgotten to mention - I wonder what it will be this time).

What I wrote was related to the NTP software package, which contains the
NTP service (ntpd) and is maintained by the NTP project.

The monitor program for Windows has been written by one of my colleagues
some time ago. You can use it in addition to the NTP package to monitor
what the NTP service is doing, but this is optional and ntpd can also be
used without the monitor program.

Martin
-- 
Martin Burnicki

Meinberg Funkuhren
Bad Pyrmont
Germany
Martin
7/20/2016 3:15:13 PM
David,

David Taylor wrote:
> I have found issues with having NTP in C:\Windows in Win-10 (possibly
> from Win-8 upwards).  Consequently I now recommend installing to:
> 
>   C:\Tools\NTP
> 
> rather than C:\Program Files\NTP.  This allows the user to edit the
> ntp.conf file more easily.
> 
>   http://www.satsignal.eu/ntp/setup.html

I know.

The directory of these files has been discussed here in the NG quite
some years ago, and the installer was configured according to the
results of this discussion.

Maybe it's time to change this, but the question is which are the
correct directories for which files?

They need to be accessible by a user and by ntpd, regardless of whether
ntpd runs under the system account or under a special NTP user account.


Martin
-- 
Martin Burnicki

Meinberg Funkuhren
Bad Pyrmont
Germany
Martin
7/20/2016 3:19:36 PM
On 20/07/2016 16:19, Martin Burnicki wrote:
> David,
>
> David Taylor wrote:
>> I have found issues with having NTP in C:\Windows in Win-10 (possibly
>> from Win-8 upwards).  Consequently I now recommend installing to:
>>
>>   C:\Tools\NTP
>>
>> rather than C:\Program Files\NTP.  This allows the user to edit the
>> ntp.conf file more easily.
>>
>>   http://www.satsignal.eu/ntp/setup.html
>
> I know.
>
> The directory of these files has been discussed here in the NG quite
> some years ago, and the installer was configured according to the
> results of this discussion.
>
> Eventually it's time to change this, but the question is which are the
> correct directories for which files?
>
> They need to be accessible by a user and by ntpd, regardless if ntpd
> runs under the system account or under a special NTP user account.
>
>
> Martin

Martin,

The solution I suggested is a workround.  I have, use, and even write 
lots of programs which require the user to edit files in the same 
directory as the program lives.  This is no longer in accordance with 
the Microsoft standard which says that user-editable files should be in:

   %appdata%

which in this system translates to:

   C:\Users\David\AppData\Roaming

meaning (I think) that it's shared across all my PCs with the same 
account ("roaming"), and that it's specific to me.  We really need the 
equivalent for "all users" and per machine, not per login.  I hope that 
someone more familiar with the current standards can help us out, as I 
could well be wrong here.

What files need to be accessible and by whom?  A list would be a good 
starting point.  For me, I would like to have at least read-access to 
the drift file, and edit access to the ntp.conf file, but I can see that 
in a locked-down configuration perhaps ntp.conf should need 
Administrator level access.  There are more than a thousand users of 
this software to provide good timekeeping for the Plane Plotter program, 
and some of them struggle with editing a .TXT file, let alone trying to 
find out how to get Administrator access!

Read and execute access to the .exe files would be automatic if they 
were placed in a suitable directory on the path.  Personally, I don't 
like programs which extend the "path".  On this PC the path is 1434 
bytes long, which is crazy, but otherwise it means putting files into 
C:\Windows or C:\Windows\System32.  I see my own NTP executables are in 
C:\Tools\NTP\bin.

I'd welcome a discussion on this, so I've changed the title.

-- 
Cheers,
David
Web: http://www.satsignal.eu
David
7/20/2016 4:23:34 PM
On 20/07/2016 17:19, Martin Burnicki wrote:
> David,
>
> David Taylor wrote:
>> I have found issues with having NTP in C:\Windows in Win-10 (possibly
>> from Win-8 upwards).  Consequently I now recommend installing to:
>>
>>   C:\Tools\NTP
>>
>> rather than C:\Program Files\NTP.  This allows the user to edit the
>> ntp.conf file more easily.
>>
>>   http://www.satsignal.eu/ntp/setup.html
>
> I know.
>
> The directory of these files has been discussed here in the NG quite
> some years ago, and the installer was configured according to the
> results of this discussion.
>
> Eventually it's time to change this, but the question is which are the
> correct directories for which files?
>
> They need to be accessible by a user and by ntpd, regardless if ntpd
> runs under the system account or under a special NTP user account.
>
>
> Martin
>

Standard Windows practice (similar to the path name standard on Linux)
is:

System global data and config files that change outside a run of the
setup program, equivalent to /etc/opt/product/ or
/usr/local/etc/product/ :
    %ProgramData%\Vendor\Product\
    (or in the registry under
       HKEY_LOCAL_MACHINE\Software\Vendor\Product\)
Constant data supplied by the setup program and not changeable later,
including program files and DLLs (equivalent to /usr/local/bin,
/usr/local/lib etc.):
    %ProgramFiles%\Vendor\Product\

Where:
    %ProgramData% is the directory returned by passing the appropriate
       identifier to the API in shfolder.dll and/or available as the
       CommonAppDataFolder property in MSI packages.  It defaults to
       C:\ProgramData in current English language Windows versions.
    %ProgramFiles% is the directory returned by passing the appropriate
       identifier to the API in shfolder.dll and/or available as the
       ProgramFilesFolder property in MSI packages.  It defaults to
       "C:\Program Files" in current English language Windows versions.
    Vendor is the program vendor or program family, e.g. "NTP" or
       "Meinberg".
    Product is the specific product within the vendor subcategory, e.g.
       "ntpd".

Permissions on %ProgramData%\Vendor\Product\ and its subdirs can be
set at install time, e.g. to restrict write access to members of the
Administrators group and the NTP daemon account.  Read access should
also be restricted for any subdirectory containing private keys.
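
To make the layout concrete, here is a small Python sketch that builds the two locations from the environment. It is only an illustration of the convention described above: the vendor and product names are the examples from this post, the fallback defaults are the English-language ones, and nothing here reflects where the current installer actually puts its files.

```python
import ntpath   # Windows path rules, so the sketch also runs on other systems
import os


def machine_config_dir(vendor, product, env=None):
    """Machine-wide, changeable config/data: %ProgramData%\\Vendor\\Product."""
    env = os.environ if env is None else env
    base = env.get("ProgramData", r"C:\ProgramData")
    return ntpath.join(base, vendor, product)


def program_dir(vendor, product, env=None):
    """Read-only program files: %ProgramFiles%\\Vendor\\Product."""
    env = os.environ if env is None else env
    base = env.get("ProgramFiles", r"C:\Program Files")
    return ntpath.join(base, vendor, product)


# A stand-in environment so the result is the same on any machine.
fake_env = {"ProgramData": r"C:\ProgramData", "ProgramFiles": r"C:\Program Files"}

conf_path = ntpath.join(machine_config_dir("NTP", "ntpd", fake_env), "ntp.conf")
exe_dir = program_dir("NTP", "ntpd", fake_env)
print(conf_path)
print(exe_dir)
```

Under this scheme ntp.conf and the drift file would live under the first directory, and the executables under the second.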

Note that on Windows Vista or later the "UAC" feature is like sudo on
Linux/BSD, being invoked if the launched program has been linked with
an XML manifest saying it should be run with Administrator privileges,
and signed by whoever compiled/released the binary.  Windows XP with
service pack 3 will ignore the sudo XML attribute as it doesn't have
this sudo/UAC feature.

For example, if editing ntp.conf requires admin privileges, then either
the dedicated editing program should be linked in this way, or a stub
program that opens the file in the system default text editor should be
included and so marked.  This stub program should be added to the start
menu with an appropriate title (Only Windows Vista allowed start menu
entries to request sudo launch of non-UAC programs, Windows 8+ has no
working start menu, but it is a popular 3rd party add-on).


Since it has no bearing on NTPD, I will spare you the information for
per user files (which is similar but different to how the home dir is
used on POSIX systems).



Enjoy

Jakob
-- 
Jakob Bohm, CIO, Partner, WiseMo A/S.  https://www.wisemo.com
Transformervej 29, 2860 Søborg, Denmark.  Direct +45 31 13 16 10
This public discussion message is non-binding and may contain errors.
WiseMo - Remote Service Management for PCs, Phones and Embedded
Jakob
7/20/2016 4:55:21 PM
On 20/07/16 06:36, Bill Ko wrote:
> Suddenly I've been experiencing weird jaunts where the offsets jump by 100 ms or more and stay there for a bit, before finally drifting to levels they were at when it was healthy.  Then after a few minutes, or a few hours - very irregular intervals - it starts all over again.  It is always heralded by jitter suddenly skyrocketing, even though the offsets are rock-steady. Then within the next round or so of queries, the offsets suddenly jump - all of them!
> What do you think might be the cause?

That's typical of a heavy network load being imposed between you and 
the internet and then removed.

The round trip time becomes asymmetric, so halving the round trip time 
no longer gives the correct time delay from the servers.  The measured 
time is now wrong.  ntpd sees the offset from the measured time grow, 
and starts to adjust its internal time to match the measured time but 
only slowly, so it isn't excessively influenced by such transients, and 
the offset starts to grow, as the error in the internal time grows to 
match the error in the measured time.  When the load is removed, the 
measured time is now right, and the internal time is wrong, so there is 
a high offset, which, again, eventually clears.

This is why offset is overrated as a quality measure.
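
The effect can be reproduced with the standard NTP on-wire calculation. The Python sketch below simulates one request/response with a perfectly synchronised client; the path delays (15 ms each way when quiet, an extra 200 ms on the return path under load) are invented numbers chosen only to illustrate the point.

```python
def ntp_exchange(true_offset, delay_out, delay_back, server_processing=0.0, t1=0.0):
    """Simulate one NTP exchange; true_offset is server clock minus client clock.

    Returns (measured_offset, measured_delay) in seconds, using the usual
    formulas offset = ((t2 - t1) + (t3 - t4)) / 2 and
    delay = (t4 - t1) - (t3 - t2).
    """
    t2 = t1 + delay_out + true_offset      # request arrives (server clock)
    t3 = t2 + server_processing            # reply leaves (server clock)
    t4 = t3 - true_offset + delay_back     # reply arrives (client clock)
    offset = ((t2 - t1) + (t3 - t4)) / 2
    delay = (t4 - t1) - (t3 - t2)
    return offset, delay


# Client clock is exactly right (true_offset = 0) in both cases.
quiet = ntp_exchange(0.0, delay_out=0.015, delay_back=0.015)
loaded = ntp_exchange(0.0, delay_out=0.015, delay_back=0.215)

print("quiet : offset %+.3f s, delay %.3f s" % quiet)
print("loaded: offset %+.3f s, delay %.3f s" % loaded)
```

In this model the measured offset is wrong by half the difference between the two path delays, so 200 ms of extra one-way queueing appears as a 100 ms offset. The measured round-trip delay grows by the full 200 ms at the same time.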
David
7/21/2016 11:28:07 PM
David Woolley wrote:
> On 20/07/16 06:36, Bill Ko wrote:
>> Suddenly I've been experiencing weird jaunts where the offsets jump by
>> 100 ms or more and stay there for a bit, before finally drifting to
>> levels they were at when it was healthy.  Then after a few minutes, or
>> a few hours - very irregular intervals - it starts all over again.  It
>> is always heralded by jitter suddenly skyrocketing, even though the
>> offsets are rock-steady. Then within the next round or so of queries,
>> the offsets suddenly jump - all of them!
>> What do you think might be the cause?
>
> That's typical of a heavy network load being imposed between you and
> internet and then removed.
>
> The round trip time becomes asymmetric, so halving the round trip time
> no longer gives the correct time delay from the servers.  The measured
> time is now wrong.  ntpd sees the offset from the measured time grow,
> and starts to adjust its internal time to match the measured time but
> only slowly, so it isn't excessively influenced by such transients, and
> the offset start to grow, as the error in the internal time grows to
> match the error in the measured time.  When the load is removed, the
> measured time in now right, and the internal time is wrong, so there is
> a high offset, which, again, eventually clears.
>
> This is why offset is overrated as a quality measure.

The HUFF PUFF filter was designed specifically for this situation; the 
key is that you must have a fairly stable base frequency offset so that 
this filter can discard a few hours of samples where the offset can be 
explained by asymmetrical network delays.
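
For readers who have not met it, the idea can be sketched in Python as follows. This is a simplified model and not ntpd's actual code: it remembers the minimum delay seen over a window of recent samples and assumes any delay above that minimum is one-sided queueing, so half of the excess is removed from the measured offset. The class name, the sample-count window and the numbers are all illustrative; in ntpd itself the filter is switched on with the `tinker huffpuff` directive, which takes a span in seconds.

```python
from collections import deque


class HuffPuff:
    """Toy huff-n'-puff style correction over the last `window_samples` samples."""

    def __init__(self, window_samples):
        self.delays = deque(maxlen=window_samples)

    def correct(self, offset, delay):
        """Return the offset with the presumed asymmetric-delay error removed."""
        self.delays.append(delay)
        min_delay = min(self.delays)
        excess = (delay - min_delay) / 2
        # Pull the offset towards zero by half the excess delay.
        return offset - excess if offset > 0 else offset + excess


hp = HuffPuff(window_samples=8)
quiet = hp.correct(offset=0.001, delay=0.030)    # link idle: nothing to correct
loaded = hp.correct(offset=-0.100, delay=0.230)  # link loaded: 200 ms extra delay

print(f"quiet  -> {quiet:+.3f} s")
print(f"loaded -> {loaded:+.3f} s")
```

The correction only works while a sample from an unloaded period is still inside the window, which is why the window has to be longer than the congestion lasts.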

Terje

-- 
- <Terje.Mathisen at tmsw.no>
"almost all programming can be viewed as an exercise in caching"
Terje
7/22/2016 9:25:55 AM
On 22/07/2016 10:25, Terje Mathisen wrote:
[]
> The HUFF PUFF filter was designed specifically for this situation, the
> key is that you must have a fairly stable base frequency offset so that
> this filter can discard a few hours of samples where the offset can be
> explained by asymmetrical network delays.
>
> Terje

I tried that many years back (when I didn't have my own stratum-1) and it 
made things worse.  But no harm in testing.

-- 
Cheers,
David
Web: http://www.satsignal.eu
David
7/22/2016 10:23:03 AM
David Taylor wrote:
> On 22/07/2016 10:25, Terje Mathisen wrote:
> []
>> The HUFF PUFF filter was designed specifically for this situation, the
>> key is that you must have a fairly stable base frequency offset so that
>> this filter can discard a few hours of samples where the offset can be
>> explained by asymmetrical network delays.
>>
>> Terje
>
> I tried that many years back (when I didn't have my own stratum-1 and it
> made things worse.  But no harm in testing.
>
Huffpuff needs careful setup: you must have a time constant which is 
larger than the maximum time you have network imbalance/asymmetric 
delays, and the local frequency stability must be sufficient to actually 
let you "coast" over those periods of stuffed network links.

Terje

-- 
- <Terje.Mathisen at tmsw.no>
"almost all programming can be viewed as an exercise in caching"
Terje
7/22/2016 1:23:01 PM
On 22/07/2016 14:23, Terje Mathisen wrote:
[]
> Huffpuff needs careful setup, you must have a time constant which is
> larger than the maximum time you have network imbalance/asymmetric
> delays, and the local frequency stability must be sufficient to actually
> let you "coast" over those periods of stuffed network links.
>
> Terje

Likely I didn't spend a lot of time playing with it.  Having a GPS/PPS 
locked stratum-1 server locally was a much better solution, and now they 
are very easy and low-cost to make with the Raspberry Pi and similar 
cards, so "everyone should have one".

-- 
Cheers,
David
Web: http://www.satsignal.eu
David
7/22/2016 4:25:32 PM
On Thursday, July 21, 2016 at 7:28:14 PM UTC-4, David Woolley wrote:
> On 20/07/16 06:36, Bill Ko wrote:
> > Suddenly I've been experiencing weird jaunts where the offsets jump by 100 ms or more and stay there for a bit, before finally drifting to levels they were at when it was healthy.  Then after a few minutes, or a few hours - very irregular intervals - it starts all over again.  It is always heralded by jitter suddenly skyrocketing, even though the offsets are rock-steady. Then within the next round or so of queries, the offsets suddenly jump - all of them!
> > What do you think might be the cause?
>
> That's typical of a heavy network load being imposed between you and
> internet and then removed.
>
> The round trip time becomes asymmetric, so halving the round trip time
> no longer gives the correct time delay from the servers.  The measured
> time is now wrong.  ntpd sees the offset from the measured time grow,
> and starts to adjust its internal time to match the measured time but
> only slowly, so it isn't excessively influenced by such transients, and
> the offset start to grow, as the error in the internal time grows to
> match the error in the measured time.  When the load is removed, the
> measured time in now right, and the internal time is wrong, so there is
> a high offset, which, again, eventually clears.
>
> This is why offset is overrated as a quality measure.

I was kind of coming to that conclusion, myself.  I think something changed with my ISP and load balancing.  Maybe there's suddenly a lot of network traffic and I get shifted around a lot.  The network itself doesn't appear saturated, but it could be that rerouting me all over the place makes it extremely difficult for NTP to do its job.
Bill
7/23/2016 4:45:15 AM
On Friday, July 22, 2016 at 5:25:57 AM UTC-4, Terje Mathisen wrote:
> David Woolley wrote:
> > On 20/07/16 06:36, Bill Ko wrote:
> >> Suddenly I've been experiencing weird jaunts where the offsets jump by
> >> 100 ms or more and stay there for a bit, before finally drifting to
> >> levels they were at when it was healthy.  Then after a few minutes, or
> >> a few hours - very irregular intervals - it starts all over again.  It
> >> is always heralded by jitter suddenly skyrocketing, even though the
> >> offsets are rock-steady. Then within the next round or so of queries,
> >> the offsets suddenly jump - all of them!
> >> What do you think might be the cause?
> >
> > That's typical of a heavy network load being imposed between you and
> > internet and then removed.
> >
> > The round trip time becomes asymmetric, so halving the round trip time
> > no longer gives the correct time delay from the servers.  The measured
> > time is now wrong.  ntpd sees the offset from the measured time grow,
> > and starts to adjust its internal time to match the measured time but
> > only slowly, so it isn't excessively influenced by such transients, and
> > the offset start to grow, as the error in the internal time grows to
> > match the error in the measured time.  When the load is removed, the
> > measured time in now right, and the internal time is wrong, so there is
> > a high offset, which, again, eventually clears.
> >
> > This is why offset is overrated as a quality measure.
> 
> The HUFF PUFF filter was designed specifically for this situation, the 
> key is that you must have a fairly stable base frequency offset so that 
> this filter can discard a few hours of samples where the offset can be 
> explained by asymmetrical network delays.
> 
> Terje
> 
> -- 
> - <Terje.Mathisen at tmsw.no>
> "almost all programming can be viewed as an exercise in caching"

I was thinking about implementing this, if indeed it was a network issue.  As I had mentioned above, maybe the network is seriously unstable now and I need to resort to this tinker thing.
Bill
7/23/2016 4:47:06 AM
On Saturday, July 23, 2016 at 12:47:08 AM UTC-4, Bill Ko wrote:
> On Friday, July 22, 2016 at 5:25:57 AM UTC-4, Terje Mathisen wrote:
> > David Woolley wrote:
> > > On 20/07/16 06:36, Bill Ko wrote:
> > >> Suddenly I've been experiencing weird jaunts where the offsets jump by
> > >> 100 ms or more and stay there for a bit, before finally drifting to
> > >> levels they were at when it was healthy.  Then after a few minutes, or
> > >> a few hours - very irregular intervals - it starts all over again.  It
> > >> is always heralded by jitter suddenly skyrocketing, even though the
> > >> offsets are rock-steady. Then within the next round or so of queries,
> > >> the offsets suddenly jump - all of them!
> > >> What do you think might be the cause?
> > >
> > > That's typical of a heavy network load being imposed between you and
> > > internet and then removed.
> > >
> > > The round trip time becomes asymmetric, so halving the round trip time
> > > no longer gives the correct time delay from the servers.  The measured
> > > time is now wrong.  ntpd sees the offset from the measured time grow,
> > > and starts to adjust its internal time to match the measured time but
> > > only slowly, so it isn't excessively influenced by such transients, and
> > > the offset start to grow, as the error in the internal time grows to
> > > match the error in the measured time.  When the load is removed, the
> > > measured time in now right, and the internal time is wrong, so there is
> > > a high offset, which, again, eventually clears.
> > >
> > > This is why offset is overrated as a quality measure.
> > 
> > The HUFF PUFF filter was designed specifically for this situation, the 
> > key is that you must have a fairly stable base frequency offset so that 
> > this filter can discard a few hours of samples where the offset can be 
> > explained by asymmetrical network delays.
> > 
> > Terje
> > 
> > -- 
> > - <Terje.Mathisen at tmsw.no>
> > "almost all programming can be viewed as an exercise in caching"
> 
> I was thinking about implementing this, if indeed it was a network issue.  As I had mentioned above, maybe the network is seriously unstable now and I need to resort this tinker thing.

Trouble is, I can't honestly say the periods of "lucidity" are longer than the periods of "insanity".  Kind of like myself.
Bill
7/23/2016 4:55:26 AM
On Saturday, July 23, 2016 at 12:55:27 AM UTC-4, Bill Ko wrote:
> On Saturday, July 23, 2016 at 12:47:08 AM UTC-4, Bill Ko wrote:
> > On Friday, July 22, 2016 at 5:25:57 AM UTC-4, Terje Mathisen wrote:
> > > David Woolley wrote:
> > > > On 20/07/16 06:36, Bill Ko wrote:
> > > > >> Suddenly I've been experiencing weird jaunts where the offsets jump by
> > > > >> 100 ms or more and stay there for a bit, before finally drifting to
> > > > >> levels they were at when it was healthy.  Then after a few minutes, or
> > > > >> a few hours - very irregular intervals - it starts all over again.  It
> > > > >> is always heralded by jitter suddenly skyrocketing, even though the
> > > > >> offsets are rock-steady. Then within the next round or so of queries,
> > > > >> the offsets suddenly jump - all of them!
> > > > >> What do you think might be the cause?
> > > > >
> > > > > That's typical of a heavy network load being imposed between you and
> > > > > internet and then removed.
> > > > >
> > > > > The round trip time becomes asymmetric, so halving the round trip time
> > > > > no longer gives the correct time delay from the servers.  The measured
> > > > > time is now wrong.  ntpd sees the offset from the measured time grow,
> > > > > and starts to adjust its internal time to match the measured time but
> > > > > only slowly, so it isn't excessively influenced by such transients, and
> > > > > the offset start to grow, as the error in the internal time grows to
> > > > > match the error in the measured time.  When the load is removed, the
> > > > > measured time in now right, and the internal time is wrong, so there is
> > > > > a high offset, which, again, eventually clears.
> > > > >
> > > > > This is why offset is overrated as a quality measure.
> > > >
> > > > The HUFF PUFF filter was designed specifically for this situation, the
> > > > key is that you must have a fairly stable base frequency offset so that
> > > > this filter can discard a few hours of samples where the offset can be
> > > > explained by asymmetrical network delays.
> > > >
> > > > Terje
> > > >
> > > > --
> > > > - <Terje.Mathisen at tmsw.no>
> > > > "almost all programming can be viewed as an exercise in caching"
> > >
> > > I was thinking about implementing this, if indeed it was a network issue.  As I had mentioned above, maybe the network is seriously unstable now and I need to resort this tinker thing.
> >
> > Trouble is, I can't honestly say the periods of "lucidity" are longer than the periods of "insanity".  Kind of like myself.

I noticed in my first post that I had incorrectly stated that the offsets were rock-steady.

So to clear up any confusion, the problem happens like this... first I notice jitter starting to get bad - a sudden jump to double-digits - then, within the next polling time or two, for each server I query, the offsets start going nuts.  All this time, though, the network delay stays very steady.

This might now point you in a different direction?

Sorry about that.
Bill
7/23/2016 5:41:14 AM
On 23/07/16 06:41, Bill Ko wrote:
> I noticed in my first post that I had incorrectly stated that the offsets were rock-steady.
>
> So to clear up any confusion, the problem happens like this... first I notice jitter starting to get bad - a sudden jump to double-digits - then, within the next polling time or two, for each server I query, the offsets start going nuts.  All this time, though, the network delay stays very steady.
>
> This might now point you in a different direction?

Most of the other statistics are interlinked.

However, if round trip time is stable and low, that suggests either a 
temperature change or a device driver with interrupt latency problems 
that is causing clock interrupts to be lost.
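
A rough illustration of why a steady round trip time alongside jumping offsets points at the local clock rather than the network. It assumes the standard NTP on-wire calculation (offset = ((t2 - t1) + (t3 - t4)) / 2, delay = (t4 - t1) - (t3 - t2)); the timestamps and the 100 ms of lost time are made up for the example and are not taken from Bill's machine.

```python
def ntp_offset_delay(t1, t2, t3, t4):
    """Return (offset, delay) in seconds from the four NTP timestamps.

    t1: client send time, t4: client receive time (read from the client clock)
    t2: server receive time, t3: server send time (read from the server clock)
    """
    offset = ((t2 - t1) + (t3 - t4)) / 2.0
    delay = (t4 - t1) - (t3 - t2)
    return offset, delay


# Hypothetical exchange: 20 ms each way, 1 ms server processing,
# client clock initially in step with the server.
t1, t2, t3, t4 = 0.000, 0.020, 0.021, 0.041
before = ntp_offset_delay(t1, t2, t3, t4)

# The same exchange after the client clock has lost 100 ms (for example
# through lost timer interrupts): both client timestamps read 0.1 s low.
lost = 0.100
after = ntp_offset_delay(t1 - lost, t2, t3, t4 - lost)

print("before: offset=%.3f s  delay=%.3f s" % before)
print("after:  offset=%.3f s  delay=%.3f s" % after)
```

In this sketch the lost time shows up entirely in the offset, while the delay is identical in both cases, because the error is common to both client timestamps and cancels out of the round trip calculation.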
David
7/23/2016 9:29:18 AM
On Saturday, July 23, 2016 at 5:29:20 AM UTC-4, David Woolley wrote:
> On 23/07/16 06:41, Bill Ko wrote:
> > I noticed in my first post that I had incorrectly stated that the offsets were rock-steady.
> >
> > So to clear up any confusion, the problem happens like this... first I notice jitter starting to get bad - a sudden jump to double-digits - then, within the next polling time or two, for each server I query, the offsets start going nuts.  All this time, though, the network delay stays very steady.
> >
> > This might now point you in a different direction?
>
> Most of the other statistics are interlinked.
>
> However, if round trip time is stable and low, that suggests either a
> temperature change or a device driver with interrupt latency problems
> that is causing clock interrupts to be lost.

Hmmm, two significant events appear to have happened directly before this problem started.

1) there were some Windows Updates installed, and

2) my UPS put my time server into hibernation because of a power outage.  The PC was down for about 6 hours.

When it came back, I saw the time server was behaving badly.  The first thing I did was to shut down Windows, let the machine turn itself off, then unplug it until I saw the network light stop blinking.  At that point, I assumed the capacitors in the power supply and motherboard had drained; sometimes a hardware issue can survive even a power off as long as the power supply has power going to it.  Unfortunately that did not help.

I can't help but think that if it was the Windows update, a Google search would find SOMEone who was complaining besides myself, but so far it seems I am the only data point.
Bill
7/24/2016 7:06:09 AM
On Sunday, July 24, 2016 at 3:06:11 AM UTC-4, Bill Ko wrote:
> On Saturday, July 23, 2016 at 5:29:20 AM UTC-4, David Woolley wrote:
> > On 23/07/16 06:41, Bill Ko wrote:
> > > I noticed in my first post that I had incorrectly stated that the offsets were rock-steady.
> > >
> > > So to clear up any confusion, the problem happens like this... first I notice jitter starting to get bad - a sudden jump to double-digits - then, within the next polling time or two, for each server I query, the offsets start going nuts.  All this time, though, the network delay stays very steady.
> > >
> > > This might now point you in a different direction?
> >
> > Most of the other statistics are interlinked.
> >
> > However, if round trip time is stable and low, that suggests either a
> > temperature change or a device driver with interrupt latency problems
> > that is causing clock interrupts to be lost.
>
> Hmmm, two significant events appear to have happened directly before this problem started.
>
> 1) there were some Windows Updates installed, and
>
> 2) my UPS put my time server into hibernation because of a power outage.  The PC was down for about 6 hours.
>
> When it came back, I saw the time server was behaving badly.  The first thing I did was to shut down Windows, let the machine turn itself off, then unplug it until I saw the network light stop blinking.  At that point, I assumed the capacitors in the power supply and motherboard had drained; sometimes a hardware issue can survive even a power off as long as the power supply has power going to it.  Unfortunately that did not help.
>
> I can't help but think that if it was the Windows update, a Google search would find SOMEone who was complaining besides myself, but so far it seems I am the only data point.

Oh yes - the environment has not changed any, and all the thermal sensors in the case and in the hard drives don't report any unusual temperatures, so I figure it should still be good enough to be a stable NTP server.

I'm baffled.  It's always "just worked" so the thought of becoming an expert at NTP really hadn't occurred to me.  I am willing and able to learn more about NTP so I don't have to bug you guys so much.
Bill
7/24/2016 7:10:49 AM
Bill,

I may have missed it, but have you ever shown us the output from an 
"ntpq -pn" command?

-- 
Cheers,
David
Web: http://www.satsignal.eu
David
7/24/2016 3:53:43 PM
On Sunday, July 24, 2016 at 11:53:44 AM UTC-4, David Taylor wrote:
> Bill,
> 
> I may have missed it, but have you ever shown us the output from an 
> "ntpq -pn" command?
> 
> -- 
> Cheers,
> David
> Web: http://www.satsignal.eu

I'm going to have one from when it is stable, one from when it is transitioning and one when it is bonkers.  Hang on...
Bill
7/24/2016 8:38:57 PM
On Sunday, July 24, 2016 at 4:39:00 PM UTC-4, Bill Ko wrote:
> On Sunday, July 24, 2016 at 11:53:44 AM UTC-4, David Taylor wrote:
> > Bill,
> > 
> > I may have missed it, but have you ever shown us the output from an 
> > "ntpq -pn" command?
> > 
> > -- 
> > Cheers,
> > David
> > Web: http://www.satsignal.eu
> 
> I'm going to have one from when it is stable, one from when it is transitioning and one when it is bonkers.  Hang on...

Okay, here's some data:

This is when it is functioning normally:

     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
 127.127.1.0     .LOCL.          12 l 187m   64    0    0.000    0.000   0.000
+209.51.161.238  .CDMA.           1 u  280 1024  377   26.965    1.401   1.600
*128.4.1.1       .PPS.            1 u  183 1024  377   28.954    2.082   3.973
-204.9.54.119    .CDMA.           1 u 1071 1024  377   43.890    3.573   2.033
-199.102.46.72   .GPS.            1 u  202 1024  377   40.987    7.040   1.583
+24.150.203.150  .PPS.            1 u  207 1024  377   41.696    2.566   1.930
-198.111.152.100 .ACTS.           1 u   71 1024  377   42.992    2.317   1.030
#132.163.4.103   .ACTS.           1 u    6 1024  377   57.985    6.505   0.977
-132.163.4.101   .ACTS.           1 u   68 1024  377   56.978    4.446   4.368
-66.220.9.122    .CDMA.           1 u  106 1024  337   81.961    6.490   2.281
-216.218.254.202 .CDMA.           1 u  219 1024  377   81.633    5.463   1.277
#162.213.2.253   .CDMA.           1 u  228 1024  377   84.996    7.376   1.175
#149.20.64.28    .SHM.            1 u   46 1024  377   85.953    7.483   1.336
-128.9.176.30    .GPS.            1 u   74 1024  377   87.920    5.305   1.271


This is when it is transitioning:

 127.127.1.0     .LOCL.          12 l   5h   64    0    0.000    0.000   0.000
*209.51.161.238  .CDMA.           1 u  410 1024  377   28.959    0.450   2.126
#128.4.1.1       .PPS.            1 u  291 1024  377   26.958    0.129   2.691
-204.9.54.119    .CDMA.           1 u    7 1024  377   44.908    3.528  35.917
-199.102.46.72   .GPS.            1 u  168 1024  377   43.979    6.624   4.318
-24.150.203.150  .PPS.            1 u  254 1024  377   45.753    0.660   2.509
+198.111.152.100 .ACTS.           1 u  179 1024  377   46.001    2.292  20.760
-132.163.4.103   .ACTS.           1 u   99 1024  377   55.985   99.116  94.027
-132.163.4.101   .ACTS.           1 u  124 1024  377   57.896    5.265  35.728
-66.220.9.122    .CDMA.           1 u  209 1024  367   81.891   98.661  93.630
+216.218.254.202 .CDMA.           1 u  320 1024  377   83.847    3.729   3.147
#162.213.2.253   .CDMA.           1 u  316 1024  377   85.218    5.038   2.659
#149.20.64.28    .SHM.            1 u   91 1024  377   86.960  102.099  93.970
-128.9.176.30    .GPS.            1 u  239 1024  377   86.916    4.915   2.382

* it is my understanding that the sudden high offset caused the high jitter - but what about 204.9.54.119?  The jitter seems disproportionate to the offset - (and the offset never spiked or anything like that, I was monitoring it).
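
One possible reading of the 204.9.54.119 row, sketched under assumptions: if ntpd follows the peer jitter definition in RFC 5905, jitter is the RMS of the differences between the offset of the sample the clock filter selected and the other samples in its eight-stage register. The offset column shows the selected sample, which need not be the newest one, so a single jumped sample can raise the jitter while the displayed offset stays put. The offsets below (3.5 ms before, 98.5 ms after) are illustrative values, not measurements, and the function ignores the precision floor that ntpd applies.

```python
import math


def peer_jitter(selected_offset, other_offsets):
    """RMS difference (ms) between the selected sample and the others.

    Simplified from the RFC 5905 definition; no lower bound is applied.
    """
    n = len(other_offsets)
    if n == 0:
        return 0.0
    total = sum((selected_offset - o) ** 2 for o in other_offsets)
    return math.sqrt(total / n)


old = 3.5    # hypothetical offset before the jump, in ms
new = 98.5   # hypothetical offset after the jump, in ms

# Case 1: the filter still reports an old sample; one of the other
# seven samples in the register is the jumped one.
case1 = peer_jitter(old, [old] * 6 + [new])

# Case 2: the filter now reports the jumped sample; the other seven
# samples are still the old ones.
case2 = peer_jitter(new, [old] * 7)

print("offset %.3f ms, jitter %.3f ms" % (old, case1))
print("offset %.3f ms, jitter %.3f ms" % (new, case2))
```

Under these assumptions the first case gives a jitter of roughly 36 ms with an unchanged offset, and the second gives a jitter of 95 ms alongside the jumped offset, which is at least the same shape as the two kinds of rows in the transitioning table.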


last vestiges of sanity:

     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
 127.127.1.0     .LOCL.          12 l   6h   64    0    0.000    0.000   0.000
+209.51.161.238  .CDMA.           1 u  467 1024  377   27.964   93.473  92.964
#128.4.1.1       .PPS.            1 u  317 1024  377   28.955   -0.575  36.223
+204.9.54.119    .CDMA.           1 u   58 1024  377   43.883   96.036  85.519
+199.102.46.72   .GPS.            1 u  227 1024  377   39.986   98.083  91.011
+24.150.203.150  .PPS.            1 u  293 1024  377   41.802   93.912  92.710
+198.111.152.100 .ACTS.           1 u  225 1024  377   43.992   93.919  86.175
*132.163.4.103   .ACTS.           1 u  123 1024  377   55.980   97.367  85.272
+132.163.4.101   .ACTS.           1 u  174 1024  377   57.981   99.630  87.805
+66.220.9.122    .CDMA.           1 u  271 1024  357   81.891   98.661  86.907
+216.218.254.202 .CDMA.           1 u  339 1024  377   81.948   97.252  92.453
#162.213.2.253   .CDMA.           1 u  334 1024  377   85.218    5.038  35.977
+149.20.64.28    .SHM.            1 u  108 1024  377   86.960  102.099  86.868
#128.9.176.30    .GPS.            1 u  271 1024  377   86.939    2.408  36.337


Totally insane:

     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
 127.127.1.0     .LOCL.          12 l   7h   64    0    0.000    0.000   0.000
#209.51.161.238  .CDMA.           1 u  548 1024  377   27.958   85.080  73.102
+128.4.1.1       .PPS.            1 u  406 1024  377   29.953  181.552 164.327
+204.9.54.119    .CDMA.           1 u  131 1024  377   45.906  183.712 119.918
+199.102.46.72   .GPS.            1 u  327 1024  377   42.987  185.813 136.921
+24.150.203.150  .PPS.            1 u  252 1024  377   42.825  180.684 129.624
+198.111.152.100 .ACTS.           1 u  213 1024  377   46.001  181.784 126.165
#132.163.4.103   .ACTS.           1 u  173 1024  377   54.984   91.792  39.179
*132.163.4.101   .ACTS.           1 u  210 1024  377   56.996  184.380 101.072
+66.220.9.122    .CDMA.           1 u  341 1024  377   80.945  183.981 115.563
#216.218.254.202 .CDMA.           1 u  382 1024  377   81.903   89.652  57.528
+162.213.2.253   .CDMA.           1 u  339 1024  375   85.911  185.879 135.257
+149.20.64.28    .SHM.            1 u  143 1024  377   88.964  187.811 106.702
+128.9.176.30    .GPS.            1 u  323 1024  377   85.885  183.348 133.174


I guess NTP reset?:

     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
 127.127.1.0     .LOCL.          12 l   11   64    1    0.000    0.000   0.977
*209.51.161.238  .CDMA.           1 u    1 1024    1   26.955   -4.408   0.998
+128.4.1.1       .PPS.            1 u    1   64    1   27.952   -4.483   1.310
#204.9.54.119    .CDMA.           1 u    -   64    1   44.887   -1.253   7.568
-199.102.46.72   .GPS.            1 u    -   64    1   40.966   -0.618   2.201
+24.150.203.150  .PPS.            1 u    -   64    1   41.797   -3.937   5.489
+198.111.152.100 .ACTS.           1 u    -   64    1   42.990   -2.666   0.977
-132.163.4.103   .ACTS.           1 u    -   64    1   55.983    0.062   1.099
-132.163.4.101   .ACTS.           1 u    -   64    1   56.947   -0.553  11.108
-66.220.9.122    .CDMA.           1 u    1   64    1   80.927   -0.670   1.456
-216.218.254.202 .CDMA.           1 u    -   64    1   80.899   -0.984   1.389
#162.213.2.253   .CDMA.           1 u    1   64    1   83.468    0.806   1.920
#149.20.64.28    .SHM.            1 u    1   64    1   87.942    3.687   1.205
-128.9.176.30    .GPS.            1 u    -   64    1   86.937   -1.184   1.572

I wish I could diagnose my own stuff and not have to pester you with this.  :(
Bill
7/25/2016 12:56:51 AM
On 25/07/2016 01:56, Bill Ko wrote:
> Okay, here's some data:
>
> This is when it is functioning normally:
>
>      remote           refid      st t when poll reach   delay   offset  jitter
> ==============================================================================
>  127.127.1.0     .LOCL.          12 l 187m   64    0    0.000    0.000   0.000
> +209.51.161.238  .CDMA.           1 u  280 1024  377   26.965    1.401   1.600
> *128.4.1.1       .PPS.            1 u  183 1024  377   28.954    2.082   3.973
> -204.9.54.119    .CDMA.           1 u 1071 1024  377   43.890    3.573   2.033
> -199.102.46.72   .GPS.            1 u  202 1024  377   40.987    7.040   1.583
> +24.150.203.150  .PPS.            1 u  207 1024  377   41.696    2.566   1.930
> -198.111.152.100 .ACTS.           1 u   71 1024  377   42.992    2.317   1.030
> #132.163.4.103   .ACTS.           1 u    6 1024  377   57.985    6.505   0.977
> -132.163.4.101   .ACTS.           1 u   68 1024  377   56.978    4.446   4.368
> -66.220.9.122    .CDMA.           1 u  106 1024  337   81.961    6.490   2.281
> -216.218.254.202 .CDMA.           1 u  219 1024  377   81.633    5.463   1.277
> #162.213.2.253   .CDMA.           1 u  228 1024  377   84.996    7.376   1.175
> #149.20.64.28    .SHM.            1 u   46 1024  377   85.953    7.483   1.336
> -128.9.176.30    .GPS.            1 u   74 1024  377   87.920    5.305   1.271
>
>
> This is when it is transitioning:
>
>  127.127.1.0     .LOCL.          12 l   5h   64    0    0.000    0.000   0.000
> *209.51.161.238  .CDMA.           1 u  410 1024  377   28.959    0.450   2.126
> #128.4.1.1       .PPS.            1 u  291 1024  377   26.958    0.129   2.691
> -204.9.54.119    .CDMA.           1 u    7 1024  377   44.908    3.528  35.917
> -199.102.46.72   .GPS.            1 u  168 1024  377   43.979    6.624   4.318
> -24.150.203.150  .PPS.            1 u  254 1024  377   45.753    0.660   2.509
> +198.111.152.100 .ACTS.           1 u  179 1024  377   46.001    2.292  20.760
> -132.163.4.103   .ACTS.           1 u   99 1024  377   55.985   99.116  94.027
> -132.163.4.101   .ACTS.           1 u  124 1024  377   57.896    5.265  35.728
> -66.220.9.122    .CDMA.           1 u  209 1024  367   81.891   98.661  93.630
> +216.218.254.202 .CDMA.           1 u  320 1024  377   83.847    3.729   3.147
> #162.213.2.253   .CDMA.           1 u  316 1024  377   85.218    5.038   2.659
> #149.20.64.28    .SHM.            1 u   91 1024  377   86.960  102.099  93.970
> -128.9.176.30    .GPS.            1 u  239 1024  377   86.916    4.915   2.382
>
> * it is my understanding that the sudden high offset caused the high jitter - but what about 204.9.54.119?  The jitter seems disproportionate to the offset - (and the offset never spiked or anything like that, I was monitoring it).
>
>
> last vestiges of sanity:
>
>      remote           refid      st t when poll reach   delay   offset  jitter
> ==============================================================================
>  127.127.1.0     .LOCL.          12 l   6h   64    0    0.000    0.000   0.000
> +209.51.161.238  .CDMA.           1 u  467 1024  377   27.964   93.473  92.964
> #128.4.1.1       .PPS.            1 u  317 1024  377   28.955   -0.575  36.223
> +204.9.54.119    .CDMA.           1 u   58 1024  377   43.883   96.036  85.519
> +199.102.46.72   .GPS.            1 u  227 1024  377   39.986   98.083  91.011
> +24.150.203.150  .PPS.            1 u  293 1024  377   41.802   93.912  92.710
> +198.111.152.100 .ACTS.           1 u  225 1024  377   43.992   93.919  86.175
> *132.163.4.103   .ACTS.           1 u  123 1024  377   55.980   97.367  85.272
> +132.163.4.101   .ACTS.           1 u  174 1024  377   57.981   99.630  87.805
> +66.220.9.122    .CDMA.           1 u  271 1024  357   81.891   98.661  86.907
> +216.218.254.202 .CDMA.           1 u  339 1024  377   81.948   97.252  92.453
> #162.213.2.253   .CDMA.           1 u  334 1024  377   85.218    5.038  35.977
> +149.20.64.28    .SHM.            1 u  108 1024  377   86.960  102.099  86.868
> #128.9.176.30    .GPS.            1 u  271 1024  377   86.939    2.408  36.337
>
>
> Totally insane:
>
>      remote           refid      st t when poll reach   delay   offset  jitter
> ==============================================================================
>  127.127.1.0     .LOCL.          12 l   7h   64    0    0.000    0.000   0.000
> #209.51.161.238  .CDMA.           1 u  548 1024  377   27.958   85.080  73.102
> +128.4.1.1       .PPS.            1 u  406 1024  377   29.953  181.552 164.327
> +204.9.54.119    .CDMA.           1 u  131 1024  377   45.906  183.712 119.918
> +199.102.46.72   .GPS.            1 u  327 1024  377   42.987  185.813 136.921
> +24.150.203.150  .PPS.            1 u  252 1024  377   42.825  180.684 129.624
> +198.111.152.100 .ACTS.           1 u  213 1024  377   46.001  181.784 126.165
> #132.163.4.103   .ACTS.           1 u  173 1024  377   54.984   91.792  39.179
> *132.163.4.101   .ACTS.           1 u  210 1024  377   56.996  184.380 101.072
> +66.220.9.122    .CDMA.           1 u  341 1024  377   80.945  183.981 115.563
> #216.218.254.202 .CDMA.           1 u  382 1024  377   81.903   89.652  57.528
> +162.213.2.253   .CDMA.           1 u  339 1024  375   85.911  185.879 135.257
> +149.20.64.28    .SHM.            1 u  143 1024  377   88.964  187.811 106.702
> +128.9.176.30    .GPS.            1 u  323 1024  377   85.885  183.348 133.174
>
>
> I guess NTP reset?:
>
>      remote           refid      st t when poll reach   delay   offset  jitter
> ==============================================================================
>  127.127.1.0     .LOCL.          12 l   11   64    1    0.000    0.000   0.977
> *209.51.161.238  .CDMA.           1 u    1 1024    1   26.955   -4.408   0.998
> +128.4.1.1       .PPS.            1 u    1   64    1   27.952   -4.483   1.310
> #204.9.54.119    .CDMA.           1 u    -   64    1   44.887   -1.253   7.568
> -199.102.46.72   .GPS.            1 u    -   64    1   40.966   -0.618   2.201
> +24.150.203.150  .PPS.            1 u    -   64    1   41.797   -3.937   5.489
> +198.111.152.100 .ACTS.           1 u    -   64    1   42.990   -2.666   0.977
> -132.163.4.103   .ACTS.           1 u    -   64    1   55.983    0.062   1.099
> -132.163.4.101   .ACTS.           1 u    -   64    1   56.947   -0.553  11.108
> -66.220.9.122    .CDMA.           1 u    1   64    1   80.927   -0.670   1.456
> -216.218.254.202 .CDMA.           1 u    -   64    1   80.899   -0.984   1.389
> #162.213.2.253   .CDMA.           1 u    1   64    1   83.468    0.806   1.920
> #149.20.64.28    .SHM.            1 u    1   64    1   87.942    3.687   1.205
> -128.9.176.30    .GPS.            1 u    -   64    1   86.937   -1.184   1.572
>
> I wish I could diagnose my own stuff and not have to pester you with this.

Bill,

Based on that data, I would guess that the problem is internal to your 
PC rather than anything to do with the network or your external 
connections.  The delay in the external connections doesn't appear to 
change much.  I'm surprised that all your externals are stratum-1 
servers.  These may be overloaded and are /sometimes/ best avoided.
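
To put numbers on the "delay doesn't change much" observation, here is a minimal sketch that compares two `ntpq -pn` snapshots peer by peer. The function names are made up for this example, the tally-code handling is simplified and assumes IPv4 addresses, and the sample rows are copied from the "normal" and "totally insane" tables above.

```python
def parse_ntpq_peers(text):
    """Parse `ntpq -pn` output into {remote: (delay, offset, jitter)} in ms."""
    peers = {}
    for line in text.splitlines():
        fields = line.split()
        # Data rows have 10 columns; the rule line and blank lines do not.
        if len(fields) != 10:
            continue
        try:
            delay, offset, jitter = (float(x) for x in fields[7:10])
        except ValueError:
            continue  # the column header row also has 10 fields
        remote = fields[0]
        # Drop a leading tally code such as '*', '+', '-' or '#'.
        if remote[0] in "*+-#xo.~":
            remote = remote[1:]
        peers[remote] = (delay, offset, jitter)
    return peers


def deltas(before, after):
    """Return (remote, delay change, offset change) for peers in both snapshots."""
    rows = []
    for remote in sorted(set(before) & set(after)):
        rows.append((remote,
                     after[remote][0] - before[remote][0],
                     after[remote][1] - before[remote][1]))
    return rows


NORMAL = """\
     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
+209.51.161.238  .CDMA.           1 u  280 1024  377   26.965    1.401   1.600
*128.4.1.1       .PPS.            1 u  183 1024  377   28.954    2.082   3.973
"""

INSANE = """\
#209.51.161.238  .CDMA.           1 u  548 1024  377   27.958   85.080  73.102
+128.4.1.1       .PPS.            1 u  406 1024  377   29.953  181.552 164.327
"""

rows = deltas(parse_ntpq_peers(NORMAL), parse_ntpq_peers(INSANE))
for remote, d_delay, d_offset in rows:
    print("%-16s delay %+8.3f ms   offset %+9.3f ms" % (remote, d_delay, d_offset))
```

For these two peers the delay moves by about 1 ms while the offset moves by tens to hundreds of milliseconds, which is the pattern that suggests a local rather than a network cause.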

I have a number of suggestions:

- check the PC for any other software which may be resetting the time 
(e.g. is the W32time service disabled?).

- let NTP choose the servers by using the pool command.  This would make 
your servers section something like:

_______________________________________________
# Use pool NTP servers
pool us.pool.ntp.org  maxpoll 6 iburst
_______________________________________________

if you were in the USA, otherwise use your local pool server.  I 
recommend this as NTP /may/ make a better choice than you have, and will 
automatically change servers should one go down.  You may get servers 
nearer to you (in network terms) with less delay.

- drop the Local Clock server unless you really need it.  It allows the 
PC to serve approximate time to others after all the Internet servers 
are lost.  Even without it you can still serve time while you have lock 
and for some time afterwards (if I understand correctly).
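
For the first suggestion, a small sketch of one way to check the Windows Time service from a script. It assumes the English-language output of `sc query w32time`; the sample text is illustrative rather than captured from a real machine, and the helper names are invented here. Note that a stopped service is not necessarily a disabled one; the start type is reported separately by `sc qc w32time`.

```python
import subprocess


def service_state(sc_output):
    """Extract the state word (e.g. RUNNING, STOPPED) from `sc query` output."""
    for line in sc_output.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "STATE":
            parts = value.split()
            # The value looks like "4  RUNNING": a numeric code, then a word.
            return parts[1] if len(parts) >= 2 else None
    return None


def query_w32time():
    """Run `sc query w32time` (Windows only) and return its state word."""
    result = subprocess.run(["sc", "query", "w32time"],
                            capture_output=True, text=True)
    return service_state(result.stdout)


# Illustrative sample of the expected layout, not real captured output.
SAMPLE = """
SERVICE_NAME: w32time
        TYPE               : 20  WIN32_SHARE_PROCESS
        STATE              : 1  STOPPED
        WIN32_EXIT_CODE    : 0  (0x0)
"""

print(service_state(SAMPLE))
```

If `query_w32time()` reports RUNNING while ntpd is also running, two programs may be adjusting the same clock, which would be worth ruling out first.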

Oh, and which version of NTP are you using?

-- 
Cheers,
David
Web: http://www.satsignal.eu
David
7/25/2016 7:58:16 AM
Bill Ko <sirstray@gmail.com> wrote:
> On Sunday, July 24, 2016 at 4:39:00 PM UTC-4, Bill Ko wrote:
>> On Sunday, July 24, 2016 at 11:53:44 AM UTC-4, David Taylor wrote:
>> > Bill,
>> > 
>> > I may have missed it, but have you ever shown us the output from an 
>> > "ntpq -pn" command?
>> > 
>> > -- 
>> > Cheers,
>> > David
>> > Web: http://www.satsignal.eu
>> 
>> I'm going to have one from when it is stable, one from when it is transitioning and one when it is bonkers.  Hang on...
>
> Okay, here's some data:
>
> This is when it is functioning normally:
>
>      remote           refid      st t when poll reach   delay   offset  jitter
> ==============================================================================
>  127.127.1.0     .LOCL.          12 l 187m   64    0    0.000    0.000   0.000
> +209.51.161.238  .CDMA.           1 u  280 1024  377   26.965    1.401   1.600
> *128.4.1.1       .PPS.            1 u  183 1024  377   28.954    2.082   3.973
> -204.9.54.119    .CDMA.           1 u 1071 1024  377   43.890    3.573   2.033
> -199.102.46.72   .GPS.            1 u  202 1024  377   40.987    7.040   1.583
> +24.150.203.150  .PPS.            1 u  207 1024  377   41.696    2.566   1.930
> -198.111.152.100 .ACTS.           1 u   71 1024  377   42.992    2.317   1.030
> #132.163.4.103   .ACTS.           1 u    6 1024  377   57.985    6.505   0.977
> -132.163.4.101   .ACTS.           1 u   68 1024  377   56.978    4.446   4.368
> -66.220.9.122    .CDMA.           1 u  106 1024  337   81.961    6.490   2.281
> -216.218.254.202 .CDMA.           1 u  219 1024  377   81.633    5.463   1.277
> #162.213.2.253   .CDMA.           1 u  228 1024  377   84.996    7.376   1.175
> #149.20.64.28    .SHM.            1 u   46 1024  377   85.953    7.483   1.336
> -128.9.176.30    .GPS.            1 u   74 1024  377   87.920    5.305   1.271

Are you REALLY trying to synchronize a Windows machine from 13 external
stratum-1 servers???  Come on...
Rob
7/25/2016 8:32:52 AM
David, Jakob,

thanks for the hints. I'm not the guy who puts the setup program together,
but I've forwarded the information you gave me.

Martin
-- 
Martin Burnicki

Meinberg Funkuhren
Bad Pyrmont
Germany
Martin
7/25/2016 2:29:17 PM
On Monday, July 25, 2016 at 4:32:54 AM UTC-4, Rob wrote:
> Bill Ko <sirstray@gmail.com> wrote:
> > On Sunday, July 24, 2016 at 4:39:00 PM UTC-4, Bill Ko wrote:
> >> On Sunday, July 24, 2016 at 11:53:44 AM UTC-4, David Taylor wrote:
> >> > Bill,
> >> >
> >> > I may have missed it, but have you ever shown us the output from an
> >> > "ntpq -pn" command?
> >> >
> >> > -- 
> >> > Cheers,
> >> > David
> >> > Web: http://www.satsignal.eu
> >>
> >> I'm going to have one from when it is stable, one from when it is transitioning and one when it is bonkers.  Hang on...
> >
> > Okay, here's some data:
> >
> > This is when it is functioning normally:
> >
> >      remote           refid      st t when poll reach   delay   offset  jitter
> > ==============================================================================
> >  127.127.1.0     .LOCL.          12 l 187m   64    0    0.000    0.000   0.000
> > +209.51.161.238  .CDMA.           1 u  280 1024  377   26.965    1.401   1.600
> > *128.4.1.1       .PPS.            1 u  183 1024  377   28.954    2.082   3.973
> > -204.9.54.119    .CDMA.           1 u 1071 1024  377   43.890    3.573   2.033
> > -199.102.46.72   .GPS.            1 u  202 1024  377   40.987    7.040   1.583
> > +24.150.203.150  .PPS.            1 u  207 1024  377   41.696    2.566   1.930
> > -198.111.152.100 .ACTS.           1 u   71 1024  377   42.992    2.317   1.030
> > #132.163.4.103   .ACTS.           1 u    6 1024  377   57.985    6.505   0.977
> > -132.163.4.101   .ACTS.           1 u   68 1024  377   56.978    4.446   4.368
> > -66.220.9.122    .CDMA.           1 u  106 1024  337   81.961    6.490   2.281
> > -216.218.254.202 .CDMA.           1 u  219 1024  377   81.633    5.463   1.277
> > #162.213.2.253   .CDMA.           1 u  228 1024  377   84.996    7.376   1.175
> > #149.20.64.28    .SHM.            1 u   46 1024  377   85.953    7.483   1.336
> > -128.9.176.30    .GPS.            1 u   74 1024  377   87.920    5.305   1.271
>
> Are you REALLY trying to synchronize a Windows machine from 13 external
> stratum-1 servers???  Come on...

Are you REALLY trying to assert your superiority and knowledge over someone who professed not to be anything close to an expert to begin with?

I admit it's tough not to be sarcastic sometimes when dealing with people with less knowledge and experience; indeed, on subject matters that I am expert in, and forums that I advise in, sometimes it's hard for me to rein that attitude in.  But I can (usually) manage it; I suggest you try to do the same.
Bill
7/26/2016 3:52:16 AM
On Monday, July 25, 2016 at 4:32:54 AM UTC-4, Rob wrote:
> Bill Ko <sirstray@gmail.com> wrote:
> > On Sunday, July 24, 2016 at 4:39:00 PM UTC-4, Bill Ko wrote:
> >> On Sunday, July 24, 2016 at 11:53:44 AM UTC-4, David Taylor wrote:
> >> > Bill,
> >> >
> >> > I may have missed it, but have you ever shown us the output from an
> >> > "ntpq -pn" command?
> >> >
> >> > -- 
> >> > Cheers,
> >> > David
> >> > Web: http://www.satsignal.eu
> >>
> >> I'm going to have one from when it is stable, one from when it is transitioning and one when it is bonkers.  Hang on...
> >
> > Okay, here's some data:
> >
> > This is when it is functioning normally:
> >
> >      remote           refid      st t when poll reach   delay   offset  jitter
> > ==============================================================================
> >  127.127.1.0     .LOCL.          12 l 187m   64    0    0.000    0.000   0.000
> > +209.51.161.238  .CDMA.           1 u  280 1024  377   26.965    1.401   1.600
> > *128.4.1.1       .PPS.            1 u  183 1024  377   28.954    2.082   3.973
> > -204.9.54.119    .CDMA.           1 u 1071 1024  377   43.890    3.573   2.033
> > -199.102.46.72   .GPS.            1 u  202 1024  377   40.987    7.040   1.583
> > +24.150.203.150  .PPS.            1 u  207 1024  377   41.696    2.566   1.930
> > -198.111.152.100 .ACTS.           1 u   71 1024  377   42.992    2.317   1.030
> > #132.163.4.103   .ACTS.           1 u    6 1024  377   57.985    6.505   0.977
> > -132.163.4.101   .ACTS.           1 u   68 1024  377   56.978    4.446   4.368
> > -66.220.9.122    .CDMA.           1 u  106 1024  337   81.961    6.490   2.281
> > -216.218.254.202 .CDMA.           1 u  219 1024  377   81.633    5.463   1.277
> > #162.213.2.253   .CDMA.           1 u  228 1024  377   84.996    7.376   1.175
> > #149.20.64.28    .SHM.            1 u   46 1024  377   85.953    7.483   1.336
> > -128.9.176.30    .GPS.            1 u   74 1024  377   87.920    5.305   1.271
>
> Are you REALLY trying to synchronize a Windows machine from 13 external
> stratum-1 servers???  Come on...

I also have to say that this setup has been working for years.  Only recently has it been giving me issues, and even now, I believe it to be more an issue with hardware than anything else.

My reasoning behind this was that NTP would only use the servers it felt were needed and the rest would be in reserve - which indeed seemed to be the way it operated.  I've experienced issues where a few of the servers became unreachable so others that were barely ever considered were pressed into service.

But come to think of it, that's what the pool does, doesn't it?
Bill
7/26/2016 3:58:41 AM
On Monday, July 25, 2016 at 3:58:18 AM UTC-4, David Taylor wrote:
> On 25/07/2016 01:56, Bill Ko wrote:
> > Okay, here's some data:
> >
> > This is when it is functioning normally:
> >
> >      remote           refid      st t when poll reach   delay   offset  jitter
> > ==============================================================================
> >  127.127.1.0     .LOCL.          12 l 187m   64    0    0.000    0.000   0.000
> > +209.51.161.238  .CDMA.           1 u  280 1024  377   26.965    1.401   1.600
> > *128.4.1.1       .PPS.            1 u  183 1024  377   28.954    2.082   3.973
> > -204.9.54.119    .CDMA.           1 u 1071 1024  377   43.890    3.573   2.033
> > -199.102.46.72   .GPS.            1 u  202 1024  377   40.987    7.040   1.583
> > +24.150.203.150  .PPS.            1 u  207 1024  377   41.696    2.566   1.930
> > -198.111.152.100 .ACTS.           1 u   71 1024  377   42.992    2.317   1.030
> > #132.163.4.103   .ACTS.           1 u    6 1024  377   57.985    6.505   0.977
> > -132.163.4.101   .ACTS.           1 u   68 1024  377   56.978    4.446   4.368
> > -66.220.9.122    .CDMA.           1 u  106 1024  337   81.961    6.490   2.281
> > -216.218.254.202 .CDMA.           1 u  219 1024  377   81.633    5.463   1.277
> > #162.213.2.253   .CDMA.           1 u  228 1024  377   84.996    7.376   1.175
> > #149.20.64.28    .SHM.            1 u   46 1024  377   85.953    7.483   1.336
> > -128.9.176.30    .GPS.            1 u   74 1024  377   87.920    5.305   1.271
> >
> >
> > This is when it is transitioning:
> >
> >  127.127.1.0     .LOCL.          12 l   5h   64    0    0.000    0.000   0.000
> > *209.51.161.238  .CDMA.           1 u  410 1024  377   28.959    0.450   2.126
> > #128.4.1.1       .PPS.            1 u  291 1024  377   26.958    0.129   2.691
> > -204.9.54.119    .CDMA.           1 u    7 1024  377   44.908    3.528  35.917
> > -199.102.46.72   .GPS.            1 u  168 1024  377   43.979    6.624   4.318
> > -24.150.203.150  .PPS.            1 u  254 1024  377   45.753    0.660   2.509
> > +198.111.152.100 .ACTS.           1 u  179 1024  377   46.001    2.292  20.760
> > -132.163.4.103   .ACTS.           1 u   99 1024  377   55.985   99.116  94.027
> > -132.163.4.101   .ACTS.           1 u  124 1024  377   57.896    5.265  35.728
> > -66.220.9.122    .CDMA.           1 u  209 1024  367   81.891   98.661  93.630
> > +216.218.254.202 .CDMA.           1 u  320 1024  377   83.847    3.729   3.147
> > #162.213.2.253   .CDMA.           1 u  316 1024  377   85.218    5.038   2.659
> > #149.20.64.28    .SHM.            1 u   91 1024  377   86.960  102.099  93.970
> > -128.9.176.30    .GPS.            1 u  239 1024  377   86.916    4.915   2.382
> >
> > * it is my understanding that the sudden high offset caused the high jitter - but what about 204.9.54.119?  The jitter seems disproportionate to the offset - (and the offset never spiked or anything like that, I was monitoring it).
> >
> >
> > last vestiges of sanity:
> >
> >      remote           refid      st t when poll reach   delay   offset  jitter
> > ==============================================================================
> >  127.127.1.0     .LOCL.          12 l   6h   64    0    0.000    0.000   0.000
> > +209.51.161.238  .CDMA.           1 u  467 1024  377   27.964   93.473  92.964
> > #128.4.1.1       .PPS.            1 u  317 1024  377   28.955   -0.575  36.223
> > +204.9.54.119    .CDMA.           1 u   58 1024  377   43.883   96.036  85.519
> > +199.102.46.72   .GPS.            1 u  227 1024  377   39.986   98.083  91.011
> > +24.150.203.150  .PPS.            1 u  293 1024  377   41.802   93.912  92.710
> > +198.111.152.100 .ACTS.           1 u  225 1024  377   43.992   93.919  86.175
> > *132.163.4.103   .ACTS.           1 u  123 1024  377   55.980   97.367  85.272
> > +132.163.4.101   .ACTS.           1 u  174 1024  377   57.981   99.630  87.805
> > +66.220.9.122    .CDMA.           1 u  271 1024  357   81.891   98.661  86.907
> > +216.218.254.202 .CDMA.           1 u  339 1024  377   81.948   97.252  92.453
> > #162.213.2.253   .CDMA.           1 u  334 1024  377   85.218    5.038  35.977
> > +149.20.64.28    .SHM.            1 u  108 1024  377   86.960  102.099  86.868
> > #128.9.176.30    .GPS.            1 u  271 1024  377   86.939    2.408  36.337
> >
> >
> > Totally insane:
> >
> >      remote           refid      st t when poll reach   delay   offset  jitter
> > ==============================================================================
> >  127.127.1.0     .LOCL.          12 l   7h   64    0    0.000    0.000   0.000
> > #209.51.161.238  .CDMA.           1 u  548 1024  377   27.958   85.080  73.102
> > +128.4.1.1       .PPS.            1 u  406 1024  377   29.953  181.552 164.327
> > +204.9.54.119    .CDMA.           1 u  131 1024  377   45.906  183.712 119.918
> > +199.102.46.72   .GPS.            1 u  327 1024  377   42.987  185.813 136.921
> > +24.150.203.150  .PPS.            1 u  252 1024  377   42.825  180.684 129.624
> > +198.111.152.100 .ACTS.           1 u  213 1024  377   46.001  181.784 126.165
> > #132.163.4.103   .ACTS.           1 u  173 1024  377   54.984   91.792  39.179
> > *132.163.4.101   .ACTS.           1 u  210 1024  377   56.996  184.380 101.072
> > +66.220.9.122    .CDMA.           1 u  341 1024  377   80.945  183.981 115.563
> > #216.218.254.202 .CDMA.           1 u  382 1024  377   81.903   89.652  57.528
> > +162.213.2.253   .CDMA.           1 u  339 1024  375   85.911  185.879 135.257
> > +149.20.64.28    .SHM.            1 u  143 1024  377   88.964  187.811 106.702
> > +128.9.176.30    .GPS.            1 u  323 1024  377   85.885  183.348 133.174
> >
> >
> > I guess NTP reset?:
> >
> >      remote           refid      st t when poll reach   delay   offset  jitter
> > ==============================================================================
> >  127.127.1.0     .LOCL.          12 l   11   64    1    0.000    0.000   0.977
> > *209.51.161.238  .CDMA.           1 u    1 1024    1   26.955   -4.408   0.998
> > +128.4.1.1       .PPS.            1 u    1   64    1   27.952   -4.483   1.310
> > #204.9.54.119    .CDMA.           1 u    -   64    1   44.887   -1.253   7.568
> > -199.102.46.72   .GPS.            1 u    -   64    1   40.966   -0.618   2.201
> > +24.150.203.150  .PPS.            1 u    -   64    1   41.797   -3.937   5.489
> > +198.111.152.100 .ACTS.           1 u    -   64    1   42.990   -2.666   0.977
> > -132.163.4.103   .ACTS.           1 u    -   64    1   55.983    0.062   1.099
> > -132.163.4.101   .ACTS.           1 u    -   64    1   56.947   -0.553  11.108
> > -66.220.9.122    .CDMA.           1 u    1   64    1   80.927   -0.670   1.456
> > -216.218.254.202 .CDMA.           1 u    -   64    1   80.899   -0.984   1.389
> > #162.213.2.253   .CDMA.           1 u    1   64    1   83.468    0.806   1.920
> > #149.20.64.28    .SHM.            1 u    1   64    1   87.942    3.687   1.205
> > -128.9.176.30    .GPS.            1 u    -   64    1   86.937   -1.184   1.572
> >
> > I wish I could diagnose my own stuff and not have to pester you with this.
> 
> Bill,
> 
> Based on that data, I would guess that the problem is internal to your 
> PC rather than anything to do with the network or your external 
> connections.  The delay in the external connections doesn't appear to 
> change much.  I'm surprised that all your externals are stratum-1 
> servers.  These may be overloaded and are /sometimes/ best avoided.
> 
> I have a number of suggestions:
> 
> - check the PC for any other software which may be resetting the time 
> (e.g. is the W32time service disabled?).
> 
> - let NTP choose the servers by using the pool command.  This would make 
> your servers section something like:
> 
> _______________________________________________
> # Use pool NTP servers
> pool us.pool.ntp.org  maxpoll 6 iburst
> _______________________________________________
> 
> if you were in the USA, otherwise use your local pool server.  I 
> recommend this as NTP /may/ make a better choice than you have, and will 
> automatically change servers should one go down.  You may get servers 
> nearer to you (in network terms) with less delay.
> 
> - drop the Local Clock server unless you really need it.  It allows the 
> PC to serve approximate time to others after all the Internet servers 
> are lost.  Even without it you can still serve time while you have lock 
> and for some time afterwards (if I understand correctly).
> 
> Oh, and which version of NTP are you using?
> 
> -- 
> Cheers,
> David
> Web: http://www.satsignal.eu

Hi, David:

I'm using ntp version 4.2.8p8.

Maybe I should look over the latest Windows update to see if there was anything that might have affected clock accuracy or access.  (Maybe Windows went and re-enabled W32time, for instance?)

Thanks for your advice.  Now I have more stuff to try, as well as a direction to go in.  :)
0
Bill
7/26/2016 4:05:24 AM
On 26/07/2016 05:05, Bill Ko wrote:
[]
> Hi, David:
>
> I'm using ntp version 4.2.8p8.
>
> Maybe I should look over the latest Windows update to see if there was anything that might have affected clock accuracy or access.  (Maybe Windows went and re-enabled W32time, for instance?)
>
> Thanks for your advice.  Now I have more stuff to try, as well as a direction to go in.  :)

Bill, that version of NTP is fine.  No problems.  On the several Win-10 
PCs here I've never seen the W32time service be re-enabled, but who knows!
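
One way to check whether W32time has crept back is to look at what "sc query w32time" and "sc qc w32time" report. The following is only a minimal Python sketch, assuming the usual English-language layout of the sc output (lines such as "STATE : 1 STOPPED" and "START_TYPE : 4 DISABLED"). The helper names are made up for illustration, and the sample text is not taken from Bill's machine.

```python
import re
import subprocess


def parse_sc_field(text, field):
    """Return the keyword that follows '<field> : <number>' in sc output, or None."""
    pattern = rf"^\s*{re.escape(field)}\s*:\s*\d+\s+(\w+)"
    match = re.search(pattern, text, re.MULTILINE)
    return match.group(1) if match else None


def w32time_status(query_text, config_text):
    """Summarise run state and start type from 'sc query' and 'sc qc' output."""
    return {
        "state": parse_sc_field(query_text, "STATE"),
        "start_type": parse_sc_field(config_text, "START_TYPE"),
    }


def run_sc(*args):
    """Run sc.exe and return its text output (only meaningful on Windows)."""
    return subprocess.run(["sc", *args], capture_output=True, text=True).stdout


# Illustrative sample text only, assumed to match the usual sc layout.
SAMPLE_QUERY = """
SERVICE_NAME: w32time
        TYPE               : 20  WIN32_SHARE_PROCESS
        STATE              : 1  STOPPED
        WIN32_EXIT_CODE    : 0  (0x0)
"""

SAMPLE_CONFIG = """
SERVICE_NAME: w32time
        TYPE               : 20  WIN32_SHARE_PROCESS
        START_TYPE         : 4   DISABLED
        ERROR_CONTROL      : 1   NORMAL
"""

print(w32time_status(SAMPLE_QUERY, SAMPLE_CONFIG))
```

On the PC itself the call would be w32time_status(run_sc("query", "w32time"), run_sc("qc", "w32time")). A state of RUNNING, or a start type other than DISABLED, would suggest something is competing with ntpd for the clock.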

Thinking further along PC losses, I'm thinking that /something/ may be 
interrupting operation for a fraction of a second.  Is the step always 
in the same direction?

One piece of software I've seen cause problems in the past is when the 
multi-media timer is enabled.  That can cause steps.  We added code some 
time back to allow NTP to enable the MMtimer itself so that other 
programs switching the MMtimer on or off didn't affect NTP - it always 
ran with the MMtimer on.  I forget now whether that's the default, or 
whether you need to enable that option.  It was several years ago. 
Google mentions the "-M" switch, and I see that in my own command-line 
in the Services control-panel:

   C:\Tools\NTP\bin\ntpd.exe -U 3 -M -g -c "C:\Tools\NTP\etc\ntp.conf"

Not sure why I have  -U 3  present, though.  I believe that's from the 
Meinberg install.

Please let us know how you get on.

-- 
Cheers,
David
Web: http://www.satsignal.eu
0
David
7/26/2016 9:32:59 AM
David Taylor wrote:
> On 26/07/2016 05:05, Bill Ko wrote:
> []
>> Hi, David:
>>
>> I'm using ntp version 4.2.8p8.
>>
>> Maybe I should look over the latest Windows update to see if there was
>> anything that might have affected clock accuracy or access.  (Maybe
>> Windows went and re-enabled W32time, for instance?)
>>
>> Thanks for your advice.  Now I have more stuff to try, as well as a
>> direction to go in.  :)
> 
> Bill, that version of NTP is fine.  No problems.  On the several Win-10
> PCs here I've never seen the W32time service be re-enabled, but who knows!
> 
> Thinking further along PC losses, I'm thinking that /something/ may be
> interrupting operation for a fraction of a second.  Is the step always
> in the same direction?
> 
> One piece of software I've seen cause problems in the past is when the
> multi-media timer is enabled.  That can cause steps.  We added code some
> time back to allow NTP to enable the MMtimer itself so that other
> programs switching the MMtimer on or off didn't affect NTP - it always
> ran with the MMtimer on.  I forget now whether that's the default, or
> whether you need to enable that option.  It was several years ago.
> Google mentions the "-M" switch, and I see that in my own command-line
> in the Services control-panel:
> 
>   C:\Tools\NTP\bin\ntpd.exe -U 3 -M -g -c "C:\Tools\NTP\etc\ntp.conf"

Yes, -M lets ntpd set the MM timer to highest resolution when the
service starts. This setting is included/enabled by default by the setup
program.

> Not sure why I have  -U 3  present, though.  I believe that's from the
> Meinberg install.

"-U 3" asks ntpd to update the interface list once every 3 seconds.

This is useful for example with laptops which don't have a Wifi/LAN
connection established when ntpd starts after boot, so ntpd is unable to
poll any upstream servers.

The original implementation had a kind of "backoff", which means that if
upstream servers can't be reached, ntpd doubles the time interval until
the next retry.

This means if you boot your laptop and connect to a Wifi network e.g. 1
hour later then ntpd may already take quite a long time until it even
retries to reach its upstream servers.

With "-U 3" ntpd becomes quickly aware that a new interface is enabled
and retries immediately. So initial synchronization is much faster. This
isn't necessary in normal computers, but it doesn't hurt there, either.
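
To put rough numbers on the difference, here is a small Python sketch. It assumes a first retry after 64 seconds and unlimited doubling, which are illustrative values only and not taken from the ntpd source; the real daemon may cap the interval. The function names are hypothetical.

```python
def backoff_wait(link_up_s, initial_s=64):
    """Seconds between link-up and the next retry when the retry interval doubles.

    initial_s and the uncapped doubling are assumptions for illustration.
    """
    next_retry = 0
    interval = initial_s
    while next_retry < link_up_s:
        next_retry += interval
        interval *= 2
    return next_retry - link_up_s


def rescan_wait(link_up_s, period_s=3):
    """Seconds between link-up and the next interface scan done every period_s seconds."""
    return -link_up_s % period_s


# Laptop booted at t = 0, Wifi connected about one hour later.
print(backoff_wait(3600))   # wait with doubling retries
print(rescan_wait(3601))    # wait with a 3-second interface scan
```

Under these assumptions the doubling scheme leaves the laptop waiting several minutes after the link comes up, while the periodic scan notices the new interface within at most 3 seconds.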

Martin
-- 
Martin Burnicki

Meinberg Funkuhren
Bad Pyrmont
Germany
0
Martin
7/26/2016 12:53:13 PM
Bill Ko wrote:
> On Monday, July 25, 2016 at 3:58:18 AM UTC-4, David Taylor wrote:
>> On 25/07/2016 01:56, Bill Ko wrote:
>>> Okay, here's some data:
[...]

What about the suggestion I made some days ago to have a look at the
Windows application event log, and let ntpd write statistics files?

https://groups.google.com/d/msg/comp.protocols.time.ntp/R4yzNUyKHqg/658KXKDCCAAJ
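
Once statistics are enabled, the loopstats file records the clock offset at every update, so a step shows up as a large change between two consecutive lines. Here is a rough Python sketch for spotting such steps, assuming the standard loopstats field order (MJD, seconds past midnight, offset in seconds, frequency in ppm, and so on). The sample lines and the 50 ms threshold are made up for illustration and are not real data from this thread.

```python
def parse_loopstats(lines):
    """Yield (mjd, seconds_past_midnight, offset_ms) for each usable loopstats line."""
    for line in lines:
        fields = line.split()
        if len(fields) < 3:
            continue  # skip blank or truncated lines
        yield int(fields[0]), float(fields[1]), float(fields[2]) * 1000.0


def find_steps(lines, threshold_ms=50.0):
    """Return (mjd, seconds, change_ms) wherever the offset changes by more than threshold_ms."""
    steps = []
    previous = None
    for mjd, seconds, offset_ms in parse_loopstats(lines):
        if previous is not None and abs(offset_ms - previous) > threshold_ms:
            steps.append((mjd, seconds, round(offset_ms - previous, 3)))
        previous = offset_ms
    return steps


# Invented sample lines in loopstats layout (not real measurements).
SAMPLE = [
    "57595 1000.000 0.002100 35.120 0.001500 0.010000 10",
    "57595 2024.000 0.002400 35.118 0.001600 0.010000 10",
    "57595 3048.000 0.097300 35.118 0.092000 0.010000 10",
    "57595 4072.000 0.096800 35.130 0.090000 0.010000 10",
]

for step in find_steps(SAMPLE):
    print(step)
```

The sign of the reported change also answers the question of whether the step is always in the same direction, and the timestamps can be compared against entries in the Windows event log.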


Martin
-- 
Martin Burnicki

Meinberg Funkuhren
Bad Pyrmont
Germany
0
Martin
7/26/2016 12:57:33 PM
On 26/07/2016 13:53, Martin Burnicki wrote:
[]
> Yes, -M lets ntpd set the MM timer to highest resolution when the
> service starts. This setting is included/enabled by default by the setup
> program.
>
>> Not sure why I have  -U 3  present, though.  I believe that's from the
>> Meinberg install.
>
> "-U 3" asks ntpd to update the interface list once every 3 seconds.
>
> This is useful for example with laptops which don't have a Wifi/LAN
> connection established when ntpd starts after boot, so ntpd is unable to
> poll any upstream servers.
>
> The original implementation had a kind of "backout", which means if
> upstream servers can't be reached ntpd doubles the time interval until
> next retry.
>
> This means if you boot your laptop and connect to a Wifi network e.g. 1
> hour later then ntpd may already take quite a long time until it even
> retries to reach its upstream servers.
>
> With "-U 3" ntpd becomes quickly aware that a new interface is enabled
> and retries immediately. So initial synchronization is much faster. This
> isn't necessary in normal computers, but it doesn't hurt there, either.
>
> Martin

Thanks for that, Martin.  I think I will add that note to my Web page so 
that I'll remember for next time!  Will help others, too.

-- 
Cheers,
David
Web: http://www.satsignal.eu
0
David
7/26/2016 3:10:14 PM
On 2016-07-26, Bill Ko <sirstray@gmail.com> wrote:
> On Monday, July 25, 2016 at 4:32:54 AM UTC-4, Rob wrote:
>> Bill Ko <sirstray@gmail.com> wrote:
>> > On Sunday, July 24, 2016 at 4:39:00 PM UTC-4, Bill Ko wrote:
>> >> On Sunday, July 24, 2016 at 11:53:44 AM UTC-4, David Taylor wrote:
>> >> > Bill,
>> >> > 
>> >> > I may have missed it, but have you ever shown us the output from an 
>> >> > "ntpq -pn" command?
>> >> > 
>> >> > -- 
>> >> > Cheers,
>> >> > David
>> >> > Web: http://www.satsignal.eu
>> >> 
>> >> I'm going to have one from when it is stable, one from when it is transitioning and one when it is bonkers.  Hang on...
>> >
>> > Okay, here's some data:
>> >
>> > This is when it is functioning normally:
>> >
>> >      remote           refid      st t when poll reach   delay   offset  jitter
>> > ==============================================================================
>> >  127.127.1.0     .LOCL.          12 l 187m   64    0    0.000    0.000   0.000
>> > +209.51.161.238  .CDMA.           1 u  280 1024  377   26.965    1.401   1.600
>> > *128.4.1.1       .PPS.            1 u  183 1024  377   28.954    2.082   3.973
>> > -204.9.54.119    .CDMA.           1 u 1071 1024  377   43.890    3.573   2.033
>> > -199.102.46.72   .GPS.            1 u  202 1024  377   40.987    7.040   1.583
>> > +24.150.203.150  .PPS.            1 u  207 1024  377   41.696    2.566   1.930
>> > -198.111.152.100 .ACTS.           1 u   71 1024  377   42.992    2.317   1.030
>> > #132.163.4.103   .ACTS.           1 u    6 1024  377   57.985    6.505   0.977
>> > -132.163.4.101   .ACTS.           1 u   68 1024  377   56.978    4.446   4.368
>> > -66.220.9.122    .CDMA.           1 u  106 1024  337   81.961    6.490   2.281
>> > -216.218.254.202 .CDMA.           1 u  219 1024  377   81.633    5.463   1.277
>> > #162.213.2.253   .CDMA.           1 u  228 1024  377   84.996    7.376   1.175
>> > #149.20.64.28    .SHM.            1 u   46 1024  377   85.953    7.483   1.336
>> > -128.9.176.30    .GPS.            1 u   74 1024  377   87.920    5.305   1.271
>> 
>> Are you REALLY trying to synchronize a Windows machine from 13 external
>> stratum-1 servers???  Come on...
>
> I also have to say that this setup has been working for years.  Only recently it has been giving me issues, and even now, I believe it more to be an issue with hardware than anything else.
>
> My reasoning behind this was that NTP would only use the servers it felt were needed and the rest would be in reserve - which indeed seemed to be the way it operated.  I've experienced issues where a few of the servers became unreachable so others that were barely ever considered were pressed into service.

Note that all of the sources are being constantly polled (reach 377), so ntp
is using all of the servers all of the time. It is excluding some from
the averaging for one reason or another, but all are being polled.
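
For reference, the reach column is an 8-bit shift register printed in octal: each poll shifts it left by one, and the lowest bit is set when a reply arrived. A value of 377 therefore means the last eight polls were all answered. A minimal Python sketch of the decoding (the function name is just for illustration):

```python
def missed_polls(reach_octal):
    """Return how many polls ago each unanswered poll occurred (0 = most recent).

    Note: shortly after a restart the register is still filling, so the
    older positions read as zero although those polls never took place.
    """
    register = int(reach_octal, 8) & 0xFF
    return [age for age in range(8) if not (register >> age) & 1]


print(missed_polls("377"))  # every one of the last eight polls answered
print(missed_polls("337"))  # one unanswered poll, five polls before the latest
```

So the 337 shown for 66.220.9.122 in the table above means one missed reply out of the last eight, not that the server is being held in reserve.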

>
> But come to think of it, that's what the pool does, doesn't it?

No, pool queries only a few servers (unless you have masses of entries
for pool servers), and only if one goes down does it look for
another one.

The comment made was that you are "tying up" a bunch of stratum 1
servers, when your system is incapable of actually using the accuracy
given by even one of them.

0
William
7/26/2016 6:06:13 PM
William Unruh <unruh@invalid.ca> wrote:
> The comment made was that you are "tying up" a bunch of stratum 1
> servers, when your system is incapable of actually using the accuracy
> given by even one of them

That is right.  To sync a Windows machine, use a couple of pool servers.
To sync any hobby machine use 3-5 servers.

To make synchronization to so many stratum-1 servers worthwhile, at least
install an OS that can do accurate timekeeping.  And even then, don't
use 13 stratum-1 servers unless they are your own.

(I operate a Linux server that is synchronized to 8 stratum-1 servers,
but they are all my own servers, and most of the reason they are all used
as references is to monitor that they are all serving correct time.  It
tends to remain within 0.1 ms offset from all servers.)
0
Rob
7/26/2016 6:27:43 PM
On Tuesday, July 26, 2016 at 2:27:45 PM UTC-4, Rob wrote:
> William Unruh <unruh@invalid.ca> wrote:
> > The comment made was that you are "tying up" a bunch of stratum 1
> > servers, when your system is incapable of actually using the accuracy
> > given by even one of them
> 
> That is right.  To sync a Windows machine, use a couple of pool servers.
> To sync any hobby machine use 3-5 servers.
> 
> To make synchronization to so many stratum-1 servers worthwhile, at least
> install an OS that can do accurate timekeeping.  And even then, don't
> use 13 stratum-1 servers unless they are your own.
> 
> (I operate a Linux server that is synchronized to 8 stratum-1 servers,
> but they are all own servers and most of the reason they are all used
> as reference is to monitor that they all are serving correct time.  it
> tends to remain within 0.1ms offset from all servers)

You're right.  It's like using calipers to scribe a 1/2 cut - and I'm using a hand saw to make the cut.  Just use a friggin tape measure, for cryin out loud.
0
Bill
7/27/2016 11:13:58 PM
On Wednesday, July 20, 2016 at 1:36:51 AM UTC-4, Bill Ko wrote:
> Dear group:
>
> I have devoted a PC to be a time server for my little home network.  It didn't need to be really accurate; I maintain it mostly just for fun.
>
> Anyway, this server, while not particularly accurate (about 35 ppm), has nevertheless been extremely reliable, and the network reliable enough to give me single-digit jitter with very consistent offsets from the five internet sources I sync to.
>
> Suddenly I've been experiencing weird jaunts where the offsets jump by 100 ms or more and stay there for a bit, before finally drifting to levels they were at when it was healthy.  Then after a few minutes, or a few hours - very irregular intervals - it starts all over again.  It is always heralded by jitter suddenly skyrocketing, even though the offsets are rock-steady. Then within the next round or so of queries, the offsets suddenly jump - all of them!
> What do you think might be the cause?
>
> Any comments, ideas and/or advice is welcome.  Thanks for taking the time to reply!
>
> Bill

Just a followup.  I'm pretty convinced it's a hardware issue now.  My computer has started crashing with stop code 0x00000101 "A clock interrupt was not received on a secondary processor within the allocated time interval".

It seems that this code is usually associated with overclocking, but this is as stock a setup as you can get.  I'll have to troubleshoot this before I can even hope to continue here.

It's just weird that it would rear its ugly head after having gone into hibernation for a few hours.

Thanks everyone - I'll take the advice given here to make new choices when/if I'll configure another time server.  And I'll pay attention to this group now that I know of its existence!
0
Bill
7/27/2016 11:38:47 PM
Bill Ko <sirstray@gmail.com> wrote:
> Just a followup.  I'm pretty convinced it's a hardware issue now.  My computer has started crashing with stop code 0x00000101 "A clock interrupt was not received on a secondary processor within the allocated time interval".
>
> It seems that this code is usually associated with overclocking, but this is as stock a setup as you can get.  I'll have to troubleshoot this before I can even hope to continue here.
>
> It's just weird that it would rear its ugly head after having gone into hibernation for a few hours.
>
> Thanks everyone - I'll take the advice given here to make new choices when/if I'll configure another time server.  And I'll pay attention to this group now that I know of its existence!

Maybe your computer is slowly dying, e.g. from bad capacitors.
Keep it switched off for a night, and make sure it cools down well.
When it does not boot immediately the next day (but maybe still boots
after a minute of "warm up"), you know this is the reason.

Get a Raspberry Pi.  It is much better as a timeserver and it consumes
way less power.
0
Rob
7/28/2016 8:47:20 AM
On 28/07/2016 09:47, Rob wrote:
[]
> Get a Raspberry Pi.  It is much better as a timeserver and it consumes
> way less power.

Some notes here:
   http://www.satsignal.eu/ntp/Raspberry-Pi-quickstart.html

-- 
Cheers,
David
Web: http://www.satsignal.eu
0
David
7/28/2016 10:48:41 AM
On 2016-07-28, Rob <nomail@example.com> wrote:
> Get a Raspberry Pi.  It is much better as a timeserver and it consumes
> way less power.

Or any router supported by OpenWrt. It might be cheaper, consume even
less power and be a better server as the NIC is not on USB.

-- 
Miroslav Lichvar
0
Miroslav
7/28/2016 12:36:26 PM
On 28/07/2016 13:36, Miroslav Lichvar wrote:
> On 2016-07-28, Rob <nomail@example.com> wrote:
>> Get a Raspberry Pi.  It is much better as a timeserver and it consumes
>> way less power.
>
> Or any router supported by OpenWrt. It might be cheaper, consume even
> less power and be a better server as the NIC is not on USB.

I have such a router, but I've not seen any support for GPS/PPS.  Has 
anyone compared OpenWRT with an Internet sync source versus a Raspberry 
Pi (or similar) with a local GPS/PPS sync?  I suspect that with my 
quality of connection the local GPS/PPS server would win.

-- 
Cheers,
David
Web: http://www.satsignal.eu
0
David
7/28/2016 4:48:56 PM
On 2016-07-28, David Taylor <david-taylor@blueyonder.co.uk.invalid> wrote:
> On 28/07/2016 13:36, Miroslav Lichvar wrote:
>> Or any router supported by OpenWrt. It might be cheaper, consume even
>> less power and be a better server as the NIC is not on USB.
>
> I have such a router, but I've not seen any support for GPS/PPS.  Has 
> anyone compared OpenWRT with an Internet sync source versus a Raspberry 
> Pi (or similar) with a local GPS/PPS sync?  I suspect that with my 
> quality of connection the local GPS/PPS server would win.

There is general support for PPS in OpenWrt; it doesn't necessarily
have to be a stratum >= 2 server. The pps-gpio kernel module is packaged
and both chronyd and ntpd are compiled with PPS support (in the
development branch). The question is whether the GPIO driver will work
on your router. You may need to patch and recompile the kernel, or use
a polling GPIO driver instead.

Here are some links you may find useful:
http://pjrlost.blogspot.cz/2015/09/ar9331-pps-gps-ntp-awesome.html
https://code.google.com/p/openwrt-stratum1/

-- 
Miroslav Lichvar
0
Miroslav
7/29/2016 6:43:57 AM
On 29/07/2016 08:43, Miroslav Lichvar wrote:
> On 2016-07-28, David Taylor <david-taylor@blueyonder.co.uk.invalid> wrote:
>> On 28/07/2016 13:36, Miroslav Lichvar wrote:
>>> Or any router supported by OpenWrt. It might be cheaper, consume even
>>> less power and be a better server as the NIC is not on USB.
>>
>> I have such a router, but I've not seen any support for GPS/PPS.  Has
>> anyone compared OpenWRT with an Internet sync source versus a Raspberry
>> Pi (or similar) with a local GPS/PPS sync?  I suspect that with my
>> quality of connection the local GPS/PPS server would win.
>
> There is a general support for PPS in OpenWrt, it doesn't necessarily
> have to be a stratum >= 2 server. The pps-gpio kernel module is packaged
> and both chronyd and ntpd are compiled with PPS support (in the
> development branch). The question is whether the GPIO driver will work
> on your router. You may need to patch and recompile the kernel, or use
> a polling GPIO driver instead.
>

The real issue is actually finding an available GPIO pin on a router to
connect to your GPS receiver.  Most routers already use their GPIO pins
to handle the LEDs and pushbuttons on the device, for which OpenWrt
tends to have working GPIO drivers and configuration files.

> Here are some links you may find useful:
> http://pjrlost.blogspot.cz/2015/09/ar9331-pps-gps-ntp-awesome.html
> https://code.google.com/p/openwrt-stratum1/
>


Enjoy

Jakob
-- 
Jakob Bohm, CIO, Partner, WiseMo A/S.  https://www.wisemo.com
Transformervej 29, 2860 Søborg, Denmark.  Direct +45 31 13 16 10
This public discussion message is non-binding and may contain errors.
WiseMo - Remote Service Management for PCs, Phones and Embedded
0
Jakob
7/29/2016 8:22:22 AM
On 29/07/2016 07:43, Miroslav Lichvar wrote:
[]
> There is a general support for PPS in OpenWrt, it doesn't necessarily
> have to be a stratum >= 2 server. The pps-gpio kernel module is packaged
> and both chronyd and ntpd are compiled with PPS support (in the
> development branch). The question is whether the GPIO driver will work
> on your router. You may need to patch and recompile the kernel, or use
> a polling GPIO driver instead.
>
> Here are some links you may find useful:
> http://pjrlost.blogspot.cz/2015/09/ar9331-pps-gps-ntp-awesome.html
> https://code.google.com/p/openwrt-stratum1/

Thanks, Miroslav.

If I had somewhere to connect a PPS signal and somewhere to connect the 
GPS serial signal, and if I could face the drudgery of cross-compiling 
a kernel, and if I had the time, perhaps I would consider this.  Right 
now, either a Raspberry Pi card or LeoNTP box is much more appealing!

-- 
Cheers,
David
Web: http://www.satsignal.eu
0
David
7/29/2016 4:57:49 PM
On Thursday, July 28, 2016 at 4:47:21 AM UTC-4, Rob wrote:
> Bill Ko <sirstray@gmail.com> wrote:
> > Just a followup.  I'm pretty convinced it's a hardware issue now.  My computer has started crashing with stop code 0x00000101 "A clock interrupt was not received on a secondary processor within the allocated time interval".
> >
> > It seems that this code is usually associated with overclocking, but this is as stock a setup as you can get.  I'll have to troubleshoot this before I can even hope to continue here.
> >
> > It's just weird that it would rear its ugly head after having gone into hibernation for a few hours.
> >
> > Thanks everyone - I'll take the advice given here to make new choices when/if I'll configure another time server.  And I'll pay attention to this group now that I know of its existence!
>
> Maybe your computer is slowly dying, e.g. from bad capacitors.
> Keep it switched off for a night, and make sure it cools down well.
> When it does not boot immediately the next day (but maybe still boots
> after a minute of "warm up"), you know this is the reason.
>
> Get a Raspberry Pi.  It is much better as a timeserver and it consumes
> way less power.

I, too, thought it could be something of that nature - it is an older DELL T5500 - so I just replicated the time server on another computer, using the same settings and config file.  This computer featured an AMD CPU instead.

Interestingly enough, it showed the same effect - although much less prominent.  I suspect this is because the motherboard clock drift numbers are halved compared to the original computer.

I already ordered a Raspberry Pi.  In addition to using it as a time server, I can practice with LabVIEW to make it do all kinds of stupid computer tricks.  ;)

So I am left to wonder what is really happening here.  The question has been downgraded to "curiosity" status since I am moving the whole timeserver to the Raspberry PI when it shows up, but I still wonder.  Maybe I should go look through all the Windows updates that occurred on the same day that the power went down and the UPS put it to sleep to see if I can spot any likely suspects.

And, rest assured, when the time server goes back up, it will be configured based on the new knowledge that everyone has imparted here in the group.
0
Bill
7/31/2016 1:43:46 AM