Folks,
[Posted to NTP Hackers, but no reaction there as yet]
I've written a small program which sends some SNTP packets to various NTP
servers on my LAN, and looks at the timestamps which the server adds as
its RX and TX times. With Windows-7 and Vista I'm seeing some odd
results. I'm measuring (server TX) - (server RX) time.
1 - Bacchus - Windows 2000 Server - GPS/PPS ref. clock - older computer -
550MHz Pentium III. Most values around 70 us (microseconds) with a tail
up to about 120us.
2 - Feenix - Windows XP Home - GPS/PPS ref. clock - 1.9GHz single core
Pentium 4. Most values around 20-40us, tail to 100us. Occasional values
out to 1000us or more.
3 - Narvik - Windows XP Pro - LAN-synced - ~2.2GHz dual-core PC. Most
values 6 - 11us. Occasionally more.
4 - Gemini - Windows Vista - LAN synced - ~2.2GHz dual-core PC. A
distribution ranging from about -1000us to +1000us, possibly triangular
(I'm looking on a log Y-axis).
5 - Puffin - Windows Vista - wifi-LAN-synced - ~2+ GHz dual-core PC.
Similar results to Gemini.
6 - Stamsund - Windows 7 - GPS/PPS ref. clock - 2.8GHz single core HT
Pentium 4. Most results in the range 17-25us, but with some extremes.
7 - Hydra - Windows 7 - LAN synced - single-core AMD 3200+. Similar
distribution to Gemini.
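For anyone wanting to repeat the measurement, the (server TX) - (server RX) calculation can be sketched roughly as below. This is not the test program itself, just a minimal illustration; the function names are made up for the example, and it assumes the standard 48-byte NTP header layout (receive timestamp at byte 32, transmit timestamp at byte 40) and ignores NTP era rollover.

```python
import struct

NTP_FRACTION = 2 ** 32  # an NTP timestamp carries a 32-bit binary fraction


def ntp_timestamp_to_us(packet: bytes, offset: int) -> int:
    """Return the 64-bit NTP timestamp at `offset` as whole microseconds."""
    seconds, fraction = struct.unpack_from("!II", packet, offset)
    return seconds * 1_000_000 + (fraction * 1_000_000) // NTP_FRACTION


def server_turnaround_us(reply: bytes) -> int:
    """(server TX) - (server RX) in microseconds for a 48-byte (S)NTP reply."""
    if len(reply) < 48:
        raise ValueError("short NTP packet")
    rx = ntp_timestamp_to_us(reply, 32)  # receive timestamp field
    tx = ntp_timestamp_to_us(reply, 40)  # transmit timestamp field
    return tx - rx
```

A negative return value is the case where the server's transmit timestamp is earlier than its receive timestamp.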
So there are two things about these results which concern me:
A - if my program is correct, it seems that the timestamps on the
Windows-7/Vista systems are being set inconsistently, in that the server
transmit timestamp can be /before/ the receive timestamp! From the
distribution it seems that one timestamp is "precise" and the other a
Windows value based on a one millisecond (approx) timer.
B - Why does the Windows-7 system with a GPS/PPS reference clock not
behave in the same way i.e. it doesn't give negative (TX-RX) times?
What I don't know is whether these results are to be expected, whether
they may have any effect on the operation of NTP, and whether they might
even be the results of coding errors. I'm wondering whether this
indicates that something could be done to improve NTP on Windows-7/Vista,
and whether it might even provide a further clue as to why 4.2.5 performs
worse on Windows-7/Vista than on 2000/XP.
Thanks,
David
David | 12/11/2009 8:23:00 AM
David,
David J Taylor wrote:
> Folks,
>
> [Posted to NTP Hackers, but no reaction there as yet]
>
> I've written a small program which sends some SNTP packets to various NTP
> servers on my LAN, and looks at the timestamps which the server adds as
> its RX and TX times. With Windows-7 and Vista I'm seeing some odd
> results. I'm measuring (server TX) - (server RX) time.
>
> 1 - Bacchus - Windows 2000 Server - GPS/PPS ref. clock - older computer -
> 550MHz Pentium III. Most values around 70 us (microseconds) with a tail
> up to about 120us.
This looks like the clock interpolation works pretty well here. 70 us sounds
like the "normal" execution time to handle the packet, which may be
extended to e.g. 120 us (or even more) if e.g. an IRQ occurs during the
processing.
> 2 - Feenix - Windows XP Home - GPS/PPS ref. clock - 1.9GHz single core
> Pentium 4. Most values around 20-40us, tail to 100us. Occasional values
> out to 1000us or more.
>
> 3 - Narvik - Windows XP Pro - LAN-synced - ~2.2GHz dual-core PC. Most
> values 6 - 11us. Occasionally more.
Similar to the above. Please keep in mind that timestamping is in user space
here, so there may be not only IRQs but also task switches etc. which extend
the time between packet reception and transmission of a reply.
> 4 - Gemini - Windows Vista - LAN synced - ~2.2GHz dual-core PC. A
> distribution ranging from about -1000us to +1000us, possibly triangular
> (I'm looking on a log Y-axis).
>
> 5 - Puffin - Windows Vista - wifi-LAN-synced - ~2+ GHz dual-core PC.
> Similar results to Gemini.
>
> 6 - Stamsund - Windows 7 - GPS/PPS ref. clock - 2.8GHz single core HT
> Pentium 4. Most results in the range 17-25us, but with some extremes.
>
> 7 - Hydra - Windows 7 - LAN synced - single-core AMD 3200+. Similar
> distribution to Gemini.
For Gemini, Puffin, and Hydra:
If you are running one of Dave Hart's 4.2.5 or 4.2.6 binaries then the clock
interpolation may be disabled, and the system time increases in 1 ms steps.
On the other hand, I've also seen systems where the interpolated time steps
back and forth by 1 ms, due to the time passed by Windows to the timer APC
callback stepping by 1 ms. I have *observed* this on XP, but I can imagine
this also happens on newer systems.
Please note:
Even if under Vista/Windows 7 the system time increments in 1 ms steps, the
nominal standard tick count is still ~15600 (15601 on a Vista machine
here), i.e. ~15.6 ms. Since this is not an integral multiple of 1 ms there
must be some math which converts from 1 ms steps to 15.6 ms steps, and that
math may suffer from rounding errors.
AFAICS this is still the same basic problem as under XP or earlier, when the MM
timer has been set: The MM timer ticks at 1 ms, but the system time ticks
at 15.625 ms, and there also needs to be a conversion from one tick rate to
the other.
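To illustrate why such a conversion needs care, here is a small sketch of the arithmetic only. It is not how the Windows kernel does it (that is not known here); it assumes a 15.625 ms interrupt period expressed in 100 ns units and 16 clock steps of about 0.977 ms per period, and the function name is invented for the example.

```python
PERIOD_100NS = 156_250   # 15.625 ms clock interrupt period in 100 ns units
STEPS_PER_PERIOD = 16    # assumed: ~0.977 ms clock steps, 16 per period


def integer_steps(period=PERIOD_100NS, n=STEPS_PER_PERIOD):
    """Split `period` into `n` integer steps, carrying the remainder forward."""
    steps, acc = [], 0
    for _ in range(n):
        acc += period
        step, acc = divmod(acc, n)  # emit the whole part, keep the remainder
        steps.append(step)
    return steps
```

The exact step would be 9765.625 units, so the integer steps have to alternate between 9765 and 9766 to add up to 156250. Simply truncating every step to 9765 would lose 10 units (1 us) per period, about 64 ppm.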
The difference in Vista/7 vs. 2000/XP seems to be that
GetSystemTimeAsFileTime returns values from the 1 ms "tick domain" for the
newer systems whereas it returns values from the 15 ms "tick domain" on
older systems.
> So there are two things about these results which concern me:
>
> A - if my program is correct, it seems that the timestamps on the
> Windows-7/Vista systems are being set inconsistently, in that the server
> transmit timestamp can be /before/ the receive timestamp! From the
> distribution it seems that one timestamp is "precise" and the other a
> Windows value based on a one millisecond (approx) timer.
As I tried to explain above, this looks to me like a +/- 1 LSB (i.e. 1ms)
problem. IIRC Dave Hart has implemented some code in the clock
interpolation routine which should reduce the potential +/- 1 ms jitter in
general. However, I'm not sure whether this routine is in effect if clock
interpolation is disabled, e.g. on Vista/7.
> B - Why does the Windows-7 system with a GPS/PPS reference clock not
> behave in the same way i.e. it doesn't give negative (TX-RX) times?
Concerning the 1ms-to-15.6ms conversion mentioned above:
A *possible* reason I can imagine is that this depends on whether the clock
runs too fast or too slow at its nominal tick rate (i.e. the on-board xtal
is below or above its nominal frequency). In one case the frequency drift
compensation has to *add* an offset to the standard tick rate, in the other
case an offset needs to be subtracted. Depending on how the
conversion has been implemented in the Windows kernel, a positive offset
may lead to rounding errors whereas a negative one may not, or vice-versa.
All the above are only assumptions.
> What I don't know is whether these results are to be expected, whether
> they may have any effect on the operation of NTP, and whether they might
> even be the results of coding errors. I'm wondering whether this
> indicates that something could be done to improve NTP on Windows-7/Vista,
> and whether it might even provide a further clue as to why 4.2.5 performs
> worse on Windows-7/Vista than on 2000/XP.
IMO it will be very hard to improve things for NTP if you do not know the
exact details why this happens. The proper solution would be if the MS
developers cared about the clock interpolation, and made the Windows system
time available at a higher resolution, especially since the available API
calls already support higher resolution.
Those guys know how the Windows timekeeping has been coded, they know when
CPU clock rates are switched to save power, etc., and they could handle this
in kernel space. So timekeeping apps like NTP would not need to care about
limitations of the underlying OS.
Martin
--
Martin Burnicki
Meinberg Funkuhren
Bad Pyrmont
Germany
Martin | 12/11/2009 1:33:16 PM
"Martin Burnicki" <martin.burnicki@meinberg.de> wrote in message
news:s9u9v6-gj3.ln1@gateway.py.meinberg.de...
[]
Martin,
Thank you very much for your detailed and considered reply. With the
Windows-2000 and Windows-XP systems I am happy with the performance. I
was able to add the kernel-mode PPS serial routine to all the GPS/PPS
systems, which does reduce the jitter reported by NTP slightly. As you
say, though, this doesn't help the precision in timestamping the NTP
network packets.
Yes, I am running Dave Hart's binaries with the interpolation disabled and
the high-resolution timer enabled, so it just relies on the ~1 kHz clock.
You make an interesting point about keeping the 1ms and 15.6ms timers in
step - that had not occurred to me before! It would be helpful to hear
from Dave Hart how the two packet timestamps are derived when
interpolation is disabled. I've made more measurements and it looks as if
both the server-RX and the server-TX times have a jitter of about 1ms
peak-to-peak with an approximately uniform distribution, and that when
subtracted the difference has a triangular distribution centred on zero,
with a -1ms to +1ms range.
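The triangular shape is what one would expect from subtracting two independent, uniformly distributed errors. A quick simulation shows this; it is only a sketch, and it assumes the RX and TX errors are independent and uniform over one 1000 us clock step, with an arbitrary 20 us turnaround. The function names are invented for the example.

```python
import random


def simulate_tx_minus_rx(n=100_000, step_us=1000.0, turnaround_us=20.0, seed=1):
    """Simulate (TX - RX) in us when each stamp has independent uniform jitter."""
    rng = random.Random(seed)
    diffs = []
    for _ in range(n):
        rx_jitter = rng.uniform(0.0, step_us)  # assumed error on the RX stamp
        tx_jitter = rng.uniform(0.0, step_us)  # assumed, independent of RX
        diffs.append(turnaround_us + tx_jitter - rx_jitter)
    return diffs


def fraction_within(diffs, centre_us, half_width_us):
    """Fraction of values no further than half_width_us from centre_us."""
    hits = sum(1 for d in diffs if abs(d - centre_us) <= half_width_us)
    return hits / len(diffs)
```

For a triangular distribution with a half-width of 1000 us, about three quarters of the values fall within 500 us of the centre, whereas a flat distribution over the same range would give only one half. Note that two readings of a single clock that never steps backwards could not give a negative difference, so the independence assumed here is exactly the part that needs explaining.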
The Windows-7 with the reference clock interests me - it's as if the
packet timestamps are being derived in a completely different way than on
the LAN-synced system. I struggle to read the source code in C, so
perhaps someone who is more familiar could confirm that.
I note your comments on Windows and agree that a lot could be done if more
information and better implementation of existing APIs were available, but
I'm still left with the questions - why does NTP 4.2.5 perform less well
than 4.2.4 on Windows-7, and why does the Windows-7 GPS/PPS system not
show the same 2ms peak-to-peak jitter?
I'm quite happy to work with someone offline on this, and my test program
is available.
Cheers,
David
David | 12/11/2009 4:26:29 PM
On Dec 11, 16:26 UTC, "David J Taylor" wrote:
> The Windows-7 with the reference clock interests me - it's as if the
> packet timestamps are being derived in a completely different way than on
> the LAN-synced system. I struggle to read the source code in C, so
> perhaps someone who is more familiar could confirm that.
I'm curious about that difference too. I can tell you that assuming
all the systems are using the -M option (so that interpolation is
disabled on all the Vista and Win7 systems) there is no difference in
the timestamping code used in the RX and TX paths, nor any difference
I can imagine in the network RX timestamp path with a network vs. PPS
refclock time source.
To answer a question that came up on hackers@, Windows does not offer
SO_TIMESTAMP or similar functionality in any release. ntpd uses its
get_systime() routine in ntp_iocompletionport.c's OnSocketRecv(), the
same routine used to fetch the TX timestamp. With interpolation
disabled, get_systime() simply calls GetSystemTimeAsFileTime() which
reads the 64-bit system time from shared memory and converts it to NTP
format. My interpretation is the differences you're seeing are likely
tied to the particular hardware and HAL being used. I suspect if you
shuffle which boxes have reference clocks, the issue of the system clock
stepping back by up to a millisecond will affect the same systems.
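The conversion step itself is simple arithmetic. The sketch below is not ntpd's code, just an illustration of the idea with made-up names: a FILETIME counts 100 ns units since 1601-01-01, while an NTP timestamp counts seconds since 1900-01-01 plus a 32-bit binary fraction.

```python
FILETIME_TICKS_PER_SECOND = 10_000_000  # FILETIME counts 100 ns units
SECONDS_1601_TO_1900 = 9_435_484_800    # FILETIME epoch to NTP epoch


def filetime_to_ntp(filetime):
    """Convert a 64-bit FILETIME value to (NTP seconds, 32-bit NTP fraction)."""
    seconds, ticks = divmod(filetime, FILETIME_TICKS_PER_SECOND)
    ntp_seconds = (seconds - SECONDS_1601_TO_1900) & 0xFFFFFFFF  # wrap to NTP era
    ntp_fraction = (ticks << 32) // FILETIME_TICKS_PER_SECOND
    return ntp_seconds, ntp_fraction
```

Since the same conversion is applied to both timestamps, it cannot by itself reorder two readings; any backward step would have to be present in the system time values already.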
Cheers,
Dave Hart
Dave | 12/11/2009 5:17:39 PM
"Dave Hart" <davehart@gmail.com> wrote in message
news:fef13ab0-a7ca-479b-9f8f-5281fdaf7fa0@m7g2000prd.googlegroups.com...
> On Dec 11, 16:26 UTC, "David J Taylor" wrote:
>> The Windows-7 with the reference clock interests me - it's as if the
>> packet timestamps are being derived in a completely different way than
>> on
>> the LAN-synced system. I struggle to read the source code in C, so
>> perhaps someone who is more familiar could confirm that.
>
> I'm curious about that difference too. I can tell you that assuming
> all the systems are using the -M option (so that interpolation is
> disabled on all the Vista and Win7 systems) there is no difference in
> the timestamping code used in the RX and TX paths, nor any difference
> I can imagine in the network RX timestamp path with a network vs. PPS
> refclock time source.
>
> To answer a question that came up on hackers@, Windows does not offer
> SO_TIMESTAMP or similar functionality in any release. ntpd uses its
> get_systime() routine in ntp_iocompletionport.c's OnSocketRecv(), the
> same routine used to fetch the TX timestamp. With interpolation
> disabled, get_systime() simply calls GetSystemTimeAsFileTime() which
> reads the 64-bit system time from shared memory and converts it to NTP
> format. My interpretation is the differences you're seeing are likely
> tied to the particular hardware and HAL being used. I suspect if you
> shuffle which boxes have reference clocks, the system clock stepping
> back up to a millisecond issue will affect the same systems.
>
> Cheers,
> Dave Hart
Dave,
Thanks for your reply. It's not convenient to shuffle boxes at the
moment, unfortunately, but there is a difference between Stamsund (P4
2.8GHz HT, ASUS P4P800) and Gemini (AMD dual-core 64 X2 4400, ASUS A8N
SLI). I just checked, and both have the -M on the startup parameters in
the Services manager. Here's what I see from NTP on the two systems:
PC Gemini:
Level,Date and Time,Source,Event ID,Task Category
Information,11/12/2009 07:16:00,NTP,3,None,Listen normally on 7 IPv6 2 fe80::44b7:ae39:8f1:1aef UDP 123
Information,11/12/2009 07:16:00,NTP,3,None,Listen normally on 6 IPv6 1 fe80::1d76:410f:c600:725 UDP 123
Information,11/12/2009 07:16:00,NTP,3,None,Listen normally on 5 v6loop 0 ::1 UDP 123
Information,11/12/2009 07:16:00,NTP,3,None,Listen normally on 4 IPv4 3 192.168.0.5 UDP 123
Information,11/12/2009 07:16:00,NTP,3,None,Listen normally on 3 IPv4 2 192.168.238.238 UDP 123
Information,11/12/2009 07:16:00,NTP,3,None,Listen normally on 2 v4loop 1 127.0.0.1 UDP 123
Information,11/12/2009 07:16:00,NTP,3,None,Listen and drop on 1 v6wildcard :: UDP 123
Information,11/12/2009 07:16:00,NTP,3,None,Listen and drop on 0 v4wildcard 0.0.0.0 UDP 123
Information,11/12/2009 07:16:00,NTP,3,None,proto: precision = 976.500 usec
Information,11/12/2009 07:16:00,NTP,3,None,using Windows clock directly
Information,11/12/2009 07:16:00,NTP,3,None,"Windows clock precision 0.977 msec, min. slew 6.400 ppm/s "
Information,11/12/2009 07:16:00,NTP,3,None,Clock interrupt period 15.625 msec
Information,11/12/2009 07:16:00,NTP,3,None,Performance counter frequency 3.580 MHz
Information,11/12/2009 07:16:00,NTP,3,None,"MM timer resolution: 1..1000000 msec, set to 1 msec "
Information,11/12/2009 07:16:00,NTP,3,None,Raised to realtime priority class
Information,11/12/2009 07:16:00,NTP,3,None,ntpd 4.2.6-o Dec 09 11:48:30.27 (UTC-00:00) 2009 (1)
PC Stamsund:
Level,Date and Time,Source,Event ID,Task Category
Warning,09/12/2009 20:32:31,NTP,2,None,"clock would have gone backward 3 times, max 97.5 usec "
Information,09/12/2009 14:14:36,NTP,3,None,Using user-mode PPS timestamp for GPS_NMEA(1)
Information,09/12/2009 14:14:35,NTP,3,None,GPS_NMEA(1) serial /dev/gps1 open at 4800 bps
Information,09/12/2009 14:14:35,NTP,3,None,Listen normally on 6 IPv6 1 fe80::5dbb:3fda:3db5:b6be UDP 123
Information,09/12/2009 14:14:35,NTP,3,None,Listen normally on 5 v6loop 0 ::1 UDP 123
Information,09/12/2009 14:14:35,NTP,3,None,Listen normally on 4 IPv4 3 192.168.238.238 UDP 123
Information,09/12/2009 14:14:35,NTP,3,None,Listen normally on 3 IPv4 2 192.168.0.7 UDP 123
Information,09/12/2009 14:14:35,NTP,3,None,Listen normally on 2 v4loop 1 127.0.0.1 UDP 123
Information,09/12/2009 14:14:35,NTP,3,None,Listen and drop on 1 v6wildcard :: UDP 123
Information,09/12/2009 14:14:34,NTP,3,None,Listen and drop on 0 v4wildcard 0.0.0.0 UDP 123
Information,09/12/2009 14:14:34,NTP,3,None,proto: precision = 1.700 usec
Information,09/12/2009 14:14:32,NTP,3,None,HZ 64.000 using 43 msec timer 23.256 Hz 64 deep
Information,09/12/2009 14:14:32,NTP,3,None,"Windows clock precision 0.977 msec, min. slew 6.400 ppm/s "
Information,09/12/2009 14:14:32,NTP,3,None,Clock interrupt period 15.625 msec (startup slew 0.1 usec/period)
Information,09/12/2009 14:14:32,NTP,3,None,Performance counter frequency 3.580 MHz
Information,09/12/2009 14:14:32,NTP,3,None,"MM timer resolution: 1..1000000 msec, set to 1 msec "
Information,09/12/2009 14:14:32,NTP,3,None,Raised to realtime priority class
Information,09/12/2009 14:14:32,NTP,3,None,ntpd 4.2.6-o Dec 09 11:48:30.27 (UTC-00:00) 2009 (1)
I note that the line: "using Windows clock directly" appears in Gemini
and not Stamsund, and that "HZ 64.000 using 43 msec timer 23.256 Hz 64
deep" appears in Stamsund and not Gemini. Stamsund also has
"NTP_USER_INTERP_DANGEROUS=1" which must have been a hangover from our
earlier experiments.
Perhaps this means I'm running Stamsund in a non-standard mode without
having remembered that I was. If so, what is the significance of it
appearing to work well as a reference server (although nothing like as
well as on Windows XP)?
Cheers,
David
David | 12/11/2009 6:31:28 PM
On 2009-12-11, Martin Burnicki <martin.burnicki@meinberg.de> wrote:
[]
And given these results would the advice given by some in this list
to go ahead and use Windows as a time server still stand?
(Of course it depends on the accuracy required. For 1 sec accuracy, it
looks like it should be fine.)
unruh | 12/11/2009 7:30:01 PM
"unruh" <unruh@wormhole.physics.ubc.ca> wrote in message
news:slrnhi57dp.iod.unruh@wormhole.physics.ubc.ca...
[]
> And given these results would the advice given by some in this list
> to go ahead and use Windows as a time server still stand?
>
> (Of course it depends on the accuracy required. For 1 sec accuracy, it
> looks like it should be fine.)
Put a different record on, Bill! If one millisecond is all you need, and
you don't want to add an extra PC with a different OS, then a properly
set up Windows system is just fine. See:
http://www.satsignal.eu/mrtg/feenix_ntp_2.html
http://www.satsignal.eu/mrtg/stamsund_ntp_2.html
David
David | 12/11/2009 8:31:40 PM
David J Taylor wrote:
> A - if my program is correct, it seems that the
> timestamps on the Windows-7/Vista systems are being
> set inconsistently, in that the server transmit
> timestamp can be /before/ the receive timestamp!
Isn't that one of the checks for a false ticker?
{I think it was at least mentioned in one of the RFCs,
I'll have to take a look in the source.}
--
E-Mail Sent to this address <BlackList@Anitech-Systems.com>
will be added to the BlackLists.
E | 12/11/2009 8:56:15 PM
"Martin Burnicki" <martin.burnicki@meinberg.de> wrote in message
news:s9u9v6-gj3.ln1@gateway.py.meinberg.de...
[]
> On the other hand, I've also seen systems where the interpolated time
> steps
> back and forth by 1 ms, due to the time passed by Windows to the timer
> APC
> callback stepping by 1 ms. I have *observed* this on XP, but I can
> imagine
> this also happens on newer systems.
[]
> Martin
Martin,
I've added a test to my program to see whether the value from
GetSystemTimeAsFileTime ever steps backwards, and it hasn't done so in my
testing so far. So I remain unsure why, on a system without the
interpolation code, I'm seeing the TX time before the RX time. Certainly
would be nice to have someone confirm the result - a default installation
on a Vista or Windows-7 should show this.
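For reference, the kind of backward-step check I mean looks roughly like this. It is a sketch rather than the actual test code: the names are invented, and time.time_ns() is used here only as a stand-in for reading the Windows system time directly.

```python
import time


def count_backward_steps(readings):
    """Return (count, largest) of backward steps in a sequence of clock readings."""
    count, largest = 0, 0
    previous = None
    for value in readings:
        if previous is not None and value < previous:
            count += 1
            largest = max(largest, previous - value)
        previous = value
    return count, largest


def sample_clock(n=100_000, read=time.time_ns):
    """Read the clock n times back to back using `read`."""
    return [read() for _ in range(n)]
```

Running count_backward_steps(sample_clock()) on the machine in question gives the number of backward steps and the largest one, in the units of the clock source (nanoseconds for time.time_ns).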
Cheers,
David
David | 12/11/2009 8:57:50 PM
On Dec 11, 6:31 pm, "David J Taylor" wrote:
> I note that the line: "using Windows clock directly" appears in Gemini
> and not Stamsund, and that "HZ 64.000 using 43 msec timer 23.256 Hz 64
> deep" appears in Stamsund and not Gemini. Stamsund also has
> "NTP_USER_INTERP_DANGEROUS=1" which must have been a hangover from our
> earlier experiments.
So it seems. You may be the only person to use that environment
variable (though I'm pretty sure it's not spelled quite right there).
> Perhaps this means I'm running Stamsund in a non-standard mode, without
> having remembered I was, and what is the significance that it appears to
> work well as a reference server (although nothing like as well as on
> Windows XP)?
It's actually very interesting to me, and I'm glad you reminded me of
it. It raises the question of why interpolation is not horribly
broken on this system with a 1ms resolution system clock, given that
we know the scheduler resolution on all the known Windows versions is
1ms. I thought the problem that broke interpolation on Win7 and Vista
systems with the system clock precision driven to 0.5 or 1ms was
caused by the sampling of clock and counter pairs occurring in phase
with the clock updates, because the interpolation scheme wants its
samples well-distributed so there is always at least one sample in the
last second or two that happened to be taken soon after the clock
ticked to a new value.
The fact that it is working despite the 1ms system clock means I don't
understand the breakage as well as I thought, and hints at the
possibility that interpolation could be made to work on more or all
Vista/7 systems.
Cheers,
Dave Hart
Dave, 12/12/2009 6:32:07 AM
"Dave Hart" <davehart@gmail.com> wrote in message
news:98c443b7-15b4-4472-b985-9c3e99788835@j9g2000prh.googlegroups.com...
> On Dec 11, 6:31 pm, "David J Taylor" wrote:
>> I note that the line: "using Windows clock directly" appears in Gemini
>> and not Stamsund, and that "HZ 64.000 using 43 msec timer 23.256 Hz 64
>> deep" appears in Stamsund and not Gemini. Stamsund also has
>> "NTP_USER_INTERP_DANGEROUS=1" which must have been a hangover from our
>> earlier experiments.
>
> So it seems. You may be the only person to use that environment
> variable (though I'm pretty sure it's not spelled quite right there).
>
>> Perhaps this means I'm running Stamsund in a non-standard mode, without
>> having remembered I was, and what is the significance that it appears to
>> work well as a reference server (although nothing like as well as on
>> Windows XP)?
>
> It's actually very interesting to me, and I'm glad you reminded me of
> it. It raises the question of why interpolation is not horribly
> broken on this system with a 1ms resolution system clock, given that
> we know the scheduler resolution on all the known Windows versions is
> 1ms. I thought the problem that broke interpolation on Win7 and Vista
> systems with the system clock precision driven to 0.5 or 1ms was
> caused by the sampling of clock and counter pairs occurring in phase
> with the clock updates, because the interpolation scheme wants its
> samples well-distributed so there is always at least one sample in the
> last second or two that happened to be taken soon after the clock
> ticked to a new value.
>
> The fact that it is working despite the 1ms system clock means I don't
> understand the breakage as well as I thought, and hints at the
> possibility that interpolation could be made to work on more or all
> Vista/7 systems.
>
> Cheers,
> Dave Hart
Dave,
It isn't spelt correctly - I was walking from one system to another and
remembering the string in my head (a mistake!). It's actually:
NTPD_USE_INTERP_DANGEROUS=1
on a second check (give or take more walking errors).
The major difference between the two systems is that one has a ref-clock
attached and the other doesn't. Plus, as you noted, HAL and hardware
differences.
I can test any 4.2.7 you want to pass me on both the LAN-synced Vista PC
(which has these apparent "server received after transmitted timestamps")
and the GPS-PPS-kernel-serial Windows-7 system, although I would prefer
not to have to reboot if at all possible. I could easily remove the
ref-clock from the Windows-7 system (Stamsund), and the lead might then
stretch to the Vista system (Gemini).
Cheers,
David
David, 12/12/2009 8:31:23 AM
"Dave Hart" <> wrote in message
news:98c443b7-15b4-4472-b985-9c3e99788835@j9g2000prh.googlegroups.com...
[]
> The fact that is working despite the 1ms system clock means I don't
> understand the breakage as well as I thought, and hints of a
> possibility interpolation could be made to work on more or all Vista/7
> systems.
>
> Cheers,
> Dave Hart
The other point on Vista (and Windows-7 - probably) was that sometimes my
badly-behaved PC would behave well for some hours after a restart (or just
a restart of NTP?), and only then would the large glitches show up. Don't
know whether that helps?
Cheers,
David
David, 12/12/2009 9:00:01 AM
unruh wrote:
> And given these results would the advice given by some in this list
> to go ahead and use Windows as a time server still stand?
>
> (Of course it depends on the accuracy required. For 1 sec accuracy, it
> looks like it should be fine.)
Yes, that's exactly the point, as also mentioned by David J. Taylor.
You *can* run ntpd on Windows machines and use them as a time server, if the
resulting accuracy is sufficient, e.g. if the clients are also Windows
workstations which suffer under the same timekeeping limitations as the
server.
Undoubtedly there are better ways to set up an NTP server using a Unix-like
OS in a mixed environment.
BTW, there is currently a bug in the NTP version shipped with openSUSE 11.2
which causes the measured time differences reported in the loopstats files
to bounce from -0.5 ms to +0.5 ms, i.e. the granularity is degraded to 1 ms.
This is reported in:
https://bugzilla.novell.com/show_bug.cgi?id=557716
Looks like this is an openSUSE-specific problem, from a change made to work
around a problem reported in NTP's bug #1219:
https://support.ntp.org/bugs/show_bug.cgi?id=1219
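One way to picture the +/-0.5 ms bounce is as plain rounding to a 1 ms grid. The Python sketch below is only an illustration of that arithmetic and makes no claim about what the openSUSE patch actually does; the names `quantise_ms` and `residuals` are made up for the example.

```python
from typing import Iterable, List


def quantise_ms(value_ms: float, step_ms: float = 1.0) -> float:
    """Round a time value to the nearest multiple of step_ms."""
    return round(value_ms / step_ms) * step_ms


def residuals(values_ms: Iterable[float], step_ms: float = 1.0) -> List[float]:
    """Error introduced by quantisation: quantised value minus true value."""
    return [quantise_ms(v, step_ms) - v for v in values_ms]


if __name__ == "__main__":
    # Hypothetical "true" offsets drifting slowly through a few milliseconds.
    true_offsets = [i * 0.13 for i in range(40)]
    errs = residuals(true_offsets)
    print(f"min {min(errs):+.2f} ms, max {max(errs):+.2f} ms")
```

Whatever the true values are, the error from rounding to a 1 ms step can never exceed half a step in either direction, which matches the reported range.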
The good news here is that this can be fixed since the code which does the
timekeeping is known (both NTP and the kernel), whereas under Windows NTP
needs to work around limitations of the Windows kernel, the code of which is
not known, so the workarounds can only be based on assumptions about how the
kernel works.
Martin
--
Martin Burnicki
Meinberg Funkuhren
Bad Pyrmont
Germany
Martin, 12/14/2009 11:07:50 AM
David J Taylor wrote:
> "Martin Burnicki" <martin.burnicki@meinberg.de> wrote in message
> news:s9u9v6-gj3.ln1@gateway.py.meinberg.de...
[...]
>> Concerning the 1ms-to-15.6ms conversion mentioned above:
>> A *possible* reason I can imagine is that this depends on whether the
>> clock runs too fast or too slow at its nominal tick rate (i.e. the
>> on-board xtal is below or above its nominal frequency). In one case the
>> frequency drift compensation has to *add* an offset to the standard tick
>> rate, in the other case an offset needs to be subtracted. Depending on
>> how the conversion has been implemented in the Windows kernel, a
>> positive offset may lead to rounding errors whereas a negative one may
>> not, or vice-versa.
>> All the above are only assumptions.
[...]
> Thank you very much for your detailed and considered reply. With the
> Windows-2000 and Windows-XP systems I am happy with the performance. I
> was able to add the kernel-mode PPS serial routine to all the GPS/PPS
> systems, which does reduce the jitter reported by NTP slightly. As you
> say, though, this doesn't help the precision in timestamping the NTP
> network packets.
>
> Yes, I am running Dave Hart's binaries with the interpolation disabled and
> the high-resolution timer enabled, so it just relies on the ~1KHz clock.
> You make an interesting point about keeping the 1ms and 15.6ms timers in
> step - that had not occurred to me before!
The Windows API call used to slew the system time reports a standard tick
rate of 15.6001 ms on a Vista machine here even if the time returned by a loop
of GetSystemTimeAsFileTime() calls increments in 1.000 ms steps.
So what happens if the default tick rate of 15.6001 ms is actually modified
to compensate for the clock drift, e.g. by +5 -> 15.6006 ms to speed up the
system clock, or e.g. by -5 -> 15.5996 ms to slow it down?
Can you check if the frequency offset measured by ntpd on the system which
has a TX time before the RX time has a different sign from the frequency
offset measured on those systems which work "good"?
The log_adj utility I've written some time ago
http://www.meinberg.de/download/utils/windows/log_adj-1.4.zip
also reports whether the adjustment applied to the standard clock tick is
positive or negative, and what its magnitude is. This may be relevant for
the reason why the problem occurs.
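To put numbers on the +5 / -5 example: assuming the adjustment is counted in 100 ns units (which is what turns 15.6001 ms into 15.6006 ms for +5), the equivalent frequency offset can be worked out as in this rough Python sketch. The function name `tick_offset_ppm` is made up for illustration, and nothing here calls the Windows API itself.

```python
UNITS_PER_MS = 10_000  # assumption: adjustments are expressed in 100 ns units


def tick_offset_ppm(nominal_ms: float, delta_units: int) -> float:
    """Frequency offset in ppm when the per-tick increment changes by delta_units."""
    nominal_units = round(nominal_ms * UNITS_PER_MS)  # 15.6001 ms -> 156001
    return delta_units / nominal_units * 1e6


if __name__ == "__main__":
    for delta in (+5, -5):
        ppm = tick_offset_ppm(15.6001, delta)
        print(f"{delta:+d} units -> {ppm:+.2f} ppm")
```

On these assumptions a change of 5 units corresponds to roughly 32 ppm in either direction, so the sign of the adjustment should match the sign of the frequency offset that ntpd reports.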
> I'm quite happy to work with someone offline on this, and my test program
> is available.
If you send me that program (or a link to it) I can give it a try.
Martin
--
Martin Burnicki
Meinberg Funkuhren
Bad Pyrmont
Germany
Martin, 12/14/2009 11:40:02 AM
"Martin Burnicki" <> wrote in message
news:ipkhv6-f29.ln1@gateway.py.meinberg.de...
[]
> The Windows API call used to slew the system time reports a standard tick
> rate of 15.6001 ms on a Vista machine here even if the time returned by a
> loop of GetSystemTimeAsFileTime() calls increments in 1.000 ms steps.
>
> So what happens if the default tick rate of 15.6001 ms is actually
> modified to compensate for the clock drift, e.g. by +5 -> 15.6006 ms to
> speed up the system clock, or e.g. by -5 -> 15.5996 ms to slow it down?
>
> Can you check if the frequency offset measured by ntpd on the system
> which has a TX time before the RX time has a different sign from the
> frequency offset measured on those systems which work "good"?
>
> The log_adj utility I've written some time ago
> http://www.meinberg.de/download/utils/windows/log_adj-1.4.zip
> also reports whether the adjustment applied to the standard clock tick
> is positive or negative, and what its magnitude is. This may be relevant
> for the reason why the problem occurs.
>
>> I'm quite happy to work with someone offline on this, and my test
>> program is available.
>
> If you send me that program (or a link to it) I can give it a try.
>
> Martin
Martin,
Thanks for your comments.
I'm not sure what frequency offset you are asking me to check - what NTP
reports? The PCs have statistics logging enabled so I can check that. I'm
just running log_adj on the two Windows-7 systems to see what there is to
report.
I'll tidy up the test program I have a little and post back here with a
link.
Cheers,
David
David, 12/14/2009 3:06:07 PM
"Martin Burnicki" <> wrote in message
news:ipkhv6-f29.ln1@gateway.py.meinberg.de...
[]
> If you send me that program (or a link to it) I can give it a try.
>
> Martin
Martin, I've just e-mailed:
- the program
- its source (Delphi 2009)
- the results from log_adj on the two PCs in case you can see anything
significant.
I hope your e-mail accepts Zip files with .exe files included.
Cheers,
David
David, 12/14/2009 3:30:49 PM
> Martin,
>
> I've added a test to my program to see whether the value from
> GetSystemTimeAsFileTime ever steps backwards, and it hasn't done so in
> my testing so far. So I remain unsure why, on a system without the
> interpolation code, I'm seeing the TX time before the RX time.
> Certainly would be nice to have someone confirm the result - a default
> installation on a Vista or Windows-7 system should show this.
>
> Cheers,
> David
There is a brief write-up of the program, and sample plots from "good" and
"bad" NTP servers, here:
http://www.satsignal.eu/ntp/NTP-timestamp.html
Martin has been able to confirm the timestamp sequence issue using
different test software running on a different OS.
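For readers without the Delphi program, the measurement itself is simple enough to sketch. The Python below is not the program described on that page; it is a minimal illustration that assumes the standard 48-byte NTP header layout (receive timestamp at bytes 32-39, transmit timestamp at bytes 40-47, each 32-bit seconds plus 32-bit fraction). The names `query_server` and `server_tx_minus_rx_us` are made up, and NTP era rollover is ignored.

```python
import socket
import struct

NTP_PACKET_LEN = 48
NTP_PORT = 123


def query_server(host: str, timeout: float = 2.0) -> bytes:
    """Send one SNTP client request and return the raw reply packet."""
    request = b"\x1b" + 47 * b"\0"  # LI=0, VN=3, Mode=3 (client)
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        sock.sendto(request, (host, NTP_PORT))
        reply, _ = sock.recvfrom(512)
    return reply


def server_tx_minus_rx_us(packet: bytes) -> float:
    """Return (server transmit - server receive) time in microseconds."""
    if len(packet) < NTP_PACKET_LEN:
        raise ValueError("short NTP packet")
    rx_sec, rx_frac, tx_sec, tx_frac = struct.unpack("!4I", packet[32:48])
    # Combine into integer units of 2**-32 s so the subtraction is exact.
    rx = (rx_sec << 32) | rx_frac
    tx = (tx_sec << 32) | tx_frac
    return (tx - rx) * 1e6 / 2**32


if __name__ == "__main__":
    import sys
    host = sys.argv[1]  # server name or address given on the command line
    print(f"{server_tx_minus_rx_us(query_server(host)):+.1f} us")
```

A negative result is the "TX before RX" case discussed in this thread. Collecting a few thousand values per server and plotting a histogram gives the kind of distribution shown in the write-up.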
Cheers,
David
David, 12/15/2009 2:36:25 PM