RC freezes at startup


Hello,

I compiled ntp-dev-4.2.1p247-RC on a Linux Debian Woody where
ntp-4.2.0b-rc1 runs OK. At startup, the daemon soon freezes: no reply to
any requests, and nothing more in syslog (past the first startup
messages). When started with -d, the daemon reproducibly shows the normal
startup lines until the first network packet is received. At that precise
moment, a fast, never-ending storm of this repeated line begins:

| receive: at 3 192.168.7.10<-192.168.7.5 mode 4 code 1 auth 0

The storm is virtual, though: the network saw only one packet.
192.168.7.5 is a server declared with iburst in ntp.conf. If that entry is
removed, the storm starts with another packet instead, perhaps from a peer:

| receive: at 14 192.168.7.10<-192.168.7.3 mode 1 code 1 keyid 00000001 len 48 mac 20 auth 1

Or sometimes 2 lines are repeated:

| receive: at 5 192.168.7.10<-192.168.7.13 mode 3 code 3 auth 0
| transmit: at 5 192.168.7.10->192.168.7.13 mode 4

Reverting to RC1 restores normal operation.


Serge.
-- 
Serge point Bets arobase laposte point net
Reply Serge 5/7/2006 7:36:33 PM

Serge Bets wrote:
> Hello,
> 
> I compiled ntp-dev-4.2.1p247-RC on a Linux Debian Woody where
> ntp-4.2.0b-rc1 runs OK. At startup, the daemon soon freezes: No reply to
> any requests, and nothing more in syslog (past the first startup
> messages). When started with -d, the daemon repeatably shows normal
> startup lines, until the 1st network packet is received. At this precise
> moment begins a fast never ending storm of this repeated line:
> 
> | receive: at 3 192.168.7.10<-192.168.7.5 mode 4 code 1 auth 0
> 

Please try with -D2 and not just -d. It should print out more
information. If it is what I think then you are just looping through the
same thing over and over again and you never get out.

Danny

_______________________________________________
questions mailing list
questions@lists.ntp.isc.org
https://lists.ntp.isc.org/mailman/listinfo/questions

Reply mayer 5/8/2006 3:26:33 AM


Danny Mayer wrote:
> Serge Bets wrote:
>> Hello,
>>
>> I compiled ntp-dev-4.2.1p247-RC on a Linux Debian Woody where
>> ntp-4.2.0b-rc1 runs OK. At startup, the daemon soon freezes: No reply to
>> any requests, and nothing more in syslog (past the first startup
>> messages). When started with -d, the daemon repeatably shows normal
>> startup lines, until the 1st network packet is received. At this precise
>> moment begins a fast never ending storm of this repeated line:
>>
>> | receive: at 3 192.168.7.10<-192.168.7.5 mode 4 code 1 auth 0
>>
> 
> Please try with -D2 and not just -d. It should print out more
> information. If it is what I think then you are just looping through the
> same thing over and over again and you never get out.
> 

Also change the following line in ntpd (line 961) from:

		while (full_recvbuffs())
to:

		while (full_recvbuffs() > 0)

It's badly coded.

Danny

Reply mayer 5/8/2006 3:32:29 AM

Serge,

I've found and reported this on more than one occasion. It happens only 
with Linux and for some reason not in every NTP release. Somebody 
changes some dinky completely unrelated thing or other and the problem 
goes away.

Dave

Reply David 5/8/2006 4:11:43 AM

I'm seeing the same problem on Sun SPARC hardware with Solaris 9.

Paul

----------------------------------------

David L. Mills wrote:

> Serge,
>
> I've found and reported this on more than one occasion. It happens only
> with Linux and for some reason not in every NTP release. Somebody
> changes some dinky completely unrelated thing or other and the problem
> goes away.
> 
> Dave

Reply Paul 5/8/2006 10:08:07 AM

Hello Danny,

 On Monday, May 8, 2006 at 3:32:29 +0000, Danny Mayer wrote:

> Danny Mayer wrote:
>> Please try with -D2 and not just -d. It should print out more
>> information. If it is what I think then you are just looping through
>> the same thing over and over again and you never get out.

Here it is with -D2, beginning just before the storm:

| sendpkt(fd=22 dst=192.168.7.5, src=192.168.7.10, ttl=0, len=48)
| transmit: at 3 192.168.7.10->192.168.7.5 mode 3
| poll_update: at 3 192.168.7.5 flags 0601 poll 6 burst 8 last 3 next 5
| get_full_recv_buffer() called and full_recvbufs is 1
| receive: at 3 192.168.7.10<-192.168.7.5 flags 19 restrict 000
| receive: at 3 192.168.7.10<-192.168.7.5 mode 4 code 1 auth 0
| addto_syslog: peer 192.168.7.5 event 'event_reach' (0x84) status 'unreach, conf, 1 event, event_reach' (0x8014)
| peer 192.168.7.5 event 'event_reach' (0x84) status 'unreach, conf, 1 event, event_reach' (0x8014)
| clock_filter: n 1 off -0.009194 del 0.003643 dsp 7.939454 jit 0.000002, age 0
| filegen  2 3356085458 0 3356035200
| receive: at 3 192.168.7.10<-192.168.7.5 flags 19 restrict 000
| receive: at 3 192.168.7.10<-192.168.7.5 mode 4 code 1 auth 0

And the last 2 lines are repeated thousands of times per second, until ^C.


> Also change the following line in ntpd (line 961) from:
> while (full_recvbuffs()) to: while (full_recvbuffs() > 0)

There is no such line in ntp-dev-4.2.1p247-RC: ntpd/ntpd.c doesn't seem to
contain any "full_recvbuffs". Line 961 and the next one are:

| rbuf = get_full_recv_buffer();
| while (rbuf != NULL)

I don't know if it's relevant, but a recursive grep finds
"full_recvbufs" spelled both with one and two "f".


Serge.
-- 
Serge point Bets arobase laposte point net
Reply Serge 5/8/2006 2:50:34 PM

Serge Bets wrote:
> Hello Danny,
> 
>  On Monday, May 8, 2006 at 3:32:29 +0000, Danny Mayer wrote:
> 
>> Danny Mayer wrote:
>>> Please try with -D2 and not just -d. It should print out more
>>> information. If it is what I think then you are just looping through
>>> the same thing over and over again and you never get out.
> 
> Here it is with -D2, beginning just before the storm:
> 
> | sendpkt(fd=22 dst=192.168.7.5, src=192.168.7.10, ttl=0, len=48)
> | transmit: at 3 192.168.7.10->192.168.7.5 mode 3
> | poll_update: at 3 192.168.7.5 flags 0601 poll 6 burst 8 last 3 next 5
> | get_full_recv_buffer() called and full_recvbufs is 1
> | receive: at 3 192.168.7.10<-192.168.7.5 flags 19 restrict 000
> | receive: at 3 192.168.7.10<-192.168.7.5 mode 4 code 1 auth 0
> | addto_syslog: peer 192.168.7.5 event 'event_reach' (0x84) status 'unreach, conf, 1 event, event_reach' (0x8014)
> | peer 192.168.7.5 event 'event_reach' (0x84) status 'unreach, conf, 1 event, event_reach' (0x8014)
> | clock_filter: n 1 off -0.009194 del 0.003643 dsp 7.939454 jit 0.000002, age 0
> | filegen  2 3356085458 0 3356035200
> | receive: at 3 192.168.7.10<-192.168.7.5 flags 19 restrict 000
> | receive: at 3 192.168.7.10<-192.168.7.5 mode 4 code 1 auth 0
> 
> And the last 2 lines are repeated thousands times per second, until ^C.
> 
> 
>> Also change the following line in ntpd (line 961) from:
>> while (full_recvbuffs()) to: while (full_recvbuffs() > 0)
> 
> No such line in ntp-dev-4.2.1p247-RC? ntpd/ntpd.c doesn't seem to
> contain any "full_recvbuffs". Lines 961 and next are:
> 

Sorry, I was looking at an older version.

> | rbuf = get_full_recv_buffer();
> | while (rbuf != NULL)
> 

The loop is broken. It never requests another recvbuf after the first
one. Try replacing the loop with this:

		rbuf = get_full_recv_buffer();
		while (rbuf != NULL)
		{
			/*
			 * Call the data procedure to handle each received
			 * packet.
			 */
			if (rbuf->receiver != NULL)	/* This should always be true */
			{
				(rbuf->receiver)(rbuf);
				freerecvbuf(rbuf);
			} else {
				msyslog(LOG_ERR, "receive buffer corruption - receiver found to be NULL - ABORTING");
				abort();
			}
			rbuf = get_full_recv_buffer();
		}

I'm not sure why the old code was unblocking I/O in the middle of this loop.

Danny

Reply mayer 5/8/2006 3:58:11 PM

 On Monday, May 8, 2006 at 15:58:11 +0000, Danny Mayer wrote:

> The loop is broken. It never requests another recvbuf after the first
> one. Try replacing the loop with this:

Many thanks, Danny: this corrected loop works perfectly, without any storm.
The logs look normal, and the daemon soon syncs. :-)


Serge.
-- 
Serge point Bets arobase laposte point net
Reply Serge 5/8/2006 11:33:44 PM
