timeout writing message to hostname: Broken pipe

I can find only one other person reporting this error on Sendmail
8.13.6, and his post offers neither a resolution nor a more precise
description of the problem, so I guess I have to ask myself.

I have upgraded some Solaris 9 DMZ hosts from Sendmail 8.13.1 to
8.13.6. Since then, every once in a while a message arrives which
Sendmail can't deliver to the internal gateway hosts, with the Broken
pipe error as the result. Since I have a FallbackMX host defined on the
hosts in question, the mail is at once sent to the fallback, which in
turn sends it back to one of the original hosts, which in turn tries to
send it onwards, and so on and so on.

Every once in a while, it seems that the receiving internal gateway
host actually receives the complete email and sends it onwards, but the
DMZ host doesn't notice this, so it keeps on looping and retrying.
The result is not only a mail which cannot be delivered, but a mail
that gets duplicated. The internal gateway hosts originally ran
8.13.1; when I came upon this problem I upgraded them to 8.13.6 as well
(they also run Solaris), but the problem persists.

So the 8.13.6 on Solaris DMZ hosts can't send to 8.13.1 or 8.13.6 on
Solaris internal gw, but it can send the same message to 8.13.1 on
RedHat Linux.

As I mentioned, this occurs for specific messages: if I pull the
message (qf and df files) off the queue, I can put it back later and get
the same problem. For every problem message, we handle a couple of
hundred thousand messages without any problem, so I can't see anything
major wrong in my configuration in general. Further down you can see the
logs from a failed attempt to send.

Anything, anyone?

Regards,

Per Brax


sendmail[13712]: [ID 801593 mail.info] k430ptn1013712:
from=<bounce-8660_HTML-8582298-10924-28559@bounce.thesender.com>,
size=33507, class=0, nrcpts=1,
msgid=<1146603969.16386@bounce.thesender.com>, proto=ESMTP,
daemon=MTA-v4, relay=mxhost4.ourcompany.com [10.98.8.19]
sendmail[13717]: [ID 801593 mail.crit] k430ptn1013712: SYSERR(root):
timeout writing message to incoming-gw1.ourcompany.com.: Broken pipe
sendmail[13717]: [ID 801593 mail.debug] k430ptn1013712:   0: fl=0x0,
mode=20666: CHR: dev=0/0, ino=235940, nlink=1, u/gid=0/3, size=0
sendmail[13717]: [ID 801593 mail.debug] k430ptn1013712:   1: fl=0x1,
mode=20666: CHR: dev=0/0, ino=235940, nlink=1, u/gid=0/3, size=0
sendmail[13717]: [ID 801593 mail.debug] k430ptn1013712:   2: fl=0x1,
mode=20666: CHR: dev=0/0, ino=235940, nlink=1, u/gid=0/3, size=0
sendmail[13717]: [ID 801593 mail.debug] k430ptn1013712:   3: fl=0x2000,
mode=150444: dev=0/0, ino=58, nlink=1, u/gid=0/0, size=0
sendmail[13717]: [ID 801593 mail.debug] k430ptn1013712:   4: fl=0x0,
mode=20000: CHR: dev=0/0, ino=38416, nlink=0, u/gid=0/0, size=0
sendmail[13717]: [ID 801593 mail.debug] k430ptn1013712:   5: fl=0x1,
mode=20666: CHR: dev=0/0, ino=235936, nlink=1, u/gid=0/3, size=0
sendmail[13717]: [ID 801593 mail.debug] k430ptn1013712:   6: fl=0x2,
mode=100600: dev=0/4, ino=50552, nlink=1, u/gid=64025/64025, size=2080
sendmail[13717]: [ID 801593 mail.debug] k430ptn1013712:   7: fl=0x2,
mode=140666: SOCK localhost->(Invalid argument)
sendmail[13717]: [ID 801593 mail.debug] k430ptn1013712:   8: fl=0x0,
mode=100600: dev=0/4, ino=50506, nlink=1, u/gid=64025/64025, size=32220
sendmail[13717]: [ID 801593 mail.debug] k430ptn1013712:   9: fl=0x2000,
mode=100640: dev=0/0, ino=242160, nlink=1, u/gid=0/64025, size=24576
sendmail[13717]: [ID 801593 mail.debug] k430ptn1013712:  10: fl=0x2000,
mode=100640: dev=0/0, ino=242160, nlink=1, u/gid=0/64025, size=24576
sendmail[13717]: [ID 801593 mail.debug] k430ptn1013712:  11: fl=0x2000,
mode=100640: dev=0/0, ino=242161, nlink=1, u/gid=0/64025, size=24576
sendmail[13717]: [ID 801593 mail.debug] k430ptn1013712:  12: fl=0x2000,
mode=100640: dev=0/0, ino=242161, nlink=1, u/gid=0/64025, size=24576
sendmail[13717]: [ID 801593 mail.debug] k430ptn1013712:  13: fl=0x82,
mode=140666: SOCK [IPv6:::]/0->(Transport endpoint is not connected)
sendmail[13717]: [ID 801593 mail.debug] k430ptn1013712:  14: fl=0x2000,
mode=100640: dev=0/0, ino=242156, nlink=1, u/gid=0/64025, size=24576
sendmail[13717]: [ID 801593 mail.debug] k430ptn1013712:  15: fl=0x2000,
mode=100640: dev=0/0, ino=242156, nlink=1, u/gid=0/64025, size=24576
sendmail[13717]: [ID 801593 mail.debug] k430ptn1013712:  16: fl=0x82,
mode=140666: SOCK [IPv6:::]/0->(Transport endpoint is not connected)
sendmail[13717]: [ID 801593 mail.debug] k430ptn1013712: MCI@0x0: NULL
sendmail[13717]: [ID 801593 mail.debug] k430ptn1013712: MCI@0x140f44:
flags=46086c<CACHED,ESMTP,SIZE,8BITMIME,DSN,ENHSTAT,PIPELINED,DLVR_BY>,
errno=32, herrno=1, exitstat=75, state=8, pid=0, maxsize=14000000,
phase=client DATA 354, mailer=esmtprec, status=4.4.2, rstatus=(null),
host=incoming-gw1.ourcompany.com., lastuse=Wed May  3 02:51:56 2006\n
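
The numeric fields in the MCI line above can be decoded with standard tables. Here is a minimal Python sketch, assuming a Unix-like platform whose errno and sysexits values match those of the logging host (for these two codes that holds on Linux and Solaris). The `mci` dictionary is only a hand-copied subset of the log line, not something sendmail produces in this form.

```python
import errno
import os

# Values copied by hand from the MCI line in the log above.
mci = {"errno": 32, "exitstat": 75, "status": "4.4.2"}

# errno 32 is EPIPE ("Broken pipe"): a write to a connection
# whose other end has gone away.
errno_name = errno.errorcode[mci["errno"]]

# Exit status 75 is EX_TEMPFAIL from sysexits.h, a temporary failure,
# so the message stays queued and is retried.
is_tempfail = mci["exitstat"] == os.EX_TEMPFAIL

# A leading 4 in an enhanced status code marks a transient failure.
is_transient = mci["status"].startswith("4.")

print(errno_name, os.strerror(mci["errno"]), is_tempfail, is_transient)
```

Read this way, the client lost the connection while writing the message body (phase "client DATA 354") and classified the failure as temporary, which fits the retry loop via the FallbackMX host described earlier.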

perbrax (6)
5/24/2006 11:28:48 AM
comp.mail.sendmail

perbrax@gmail.com writes:
> [...]
> As I mentioned, this occurs for specific messages, if I pull the
> message (qf and df file) off the queue, I can put it back later and get
> the same problem. For every problem message, we deal with a couple of
> 100000 messages without any problem, so I can't see anything major
> wrong in my configuration in general. Way down you can see the logs
> from a failed attempt to send.
>
> Anything, anyone?
> [...]

Have you tried to push such a message in verbose mode?
[ using -qI... to select the message ]

sendmail -v -qI...

A full transcript of the (E)SMTP session may provide more hints.

-- 
[pl2en: Andrew] Andrzej Adam Filip : anfi@priv.onet.pl : anfi@xl.wp.pl
http://anfi.homeunix.net/sendmail/   http://www.linkedin.com/in/andfil  
Before You Ask: http://anfi.homeunix.net/sendmail/B4UAsk-Sendmail.html
anfi (2014)
5/24/2006 12:34:11 PM
Hi Andrzej,

Yeah, I did. Didn't want to make too big a post without anyone showing
any interest.
Apart from that, it doesn't say much more. Here we go, anonymized as
the previous one I hope:

Thanks for showing an interest.

Regards,

Per

# sendmail -qIBrax -v

Running /var/spool/mqinet/kBrax100000000 (sequence 1 of 1)
<Per.Brax@ourcompany.com>... Connecting to incoming-gw1.ourcompany.com.
via esmtprec...
220 incoming-gw1.ourcompany.com ESMTP Sendmail 8.13.6/8.13.1; Wed, 24
May 2006 10:21:37 +0200 (MEST)
>>> EHLO dmzhost1.ourcompany.com
250-incoming-gw1.ourcompany.com Hello dmzhost1.ourcompany.com
[10.83.33.90], pleased to meet you
250-ENHANCEDSTATUSCODES
250-PIPELINING
250-8BITMIME
250-SIZE 14000000
250-DSN
250-ETRN
250-DELIVERBY
250 HELP
>>> MAIL From:<per@nospam.brax.org> SIZE=87750
250 2.1.0 <per@nospam.brax.org>... Sender ok
>>> RCPT To:<Per.Brax@ourcompany.com>
250 2.1.5 <Per.Brax@ourcompany.com>... Recipient ok
>>> DATA
354 Enter mail, end with "." on a line by itself
timeout writing message to selumgw1.lu.se.tetrapak.com.: Broken pipe
<Per.Brax@ourcompany.com>... Connecting to incoming-gw0.ourcompany.com.
via esmtprec...
<Per.Brax@ourcompany.com>... Closing connection to
incoming-gw1.ourcompany.com.

perbrax (6)
5/24/2006 12:59:00 PM
"Per Brax" <perbrax@gmail.com> writes:

> Hi Andrzej,
>
> Yeah, I did. Didn't want to make too big a post without anyone showing
> any interest.
> Apart from that, it doesn't say much more. Here we go, anonymized as
> the previous one I hope:
>
> Thanks for showing an interest.
>
> Regards,
>
> Per
>
> # sendmail -qIBrax -v
>
> Running /var/spool/mqinet/kBrax100000000 (sequence 1 of 1)
> <Per.Brax@ourcompany.com>... Connecting to incoming-gw1.ourcompany.com.
> via esmtprec...
> 220 incoming-gw1.ourcompany.com ESMTP Sendmail 8.13.6/8.13.1; Wed, 24
> May 2006 10:21:37 +0200 (MEST)
>>>> EHLO dmzhost1.ourcompany.com
> 250-incoming-gw1.ourcompany.com Hello dmzhost1.ourcompany.com
> [10.83.33.90], pleased to meet you
> 250-ENHANCEDSTATUSCODES
> 250-PIPELINING
> 250-8BITMIME
> 250-SIZE 14000000
> 250-DSN
> 250-ETRN
> 250-DELIVERBY
> 250 HELP
>>>> MAIL From:<per@nospam.brax.org> SIZE=87750
> 250 2.1.0 <per@nospam.brax.org>... Sender ok
>>>> RCPT To:<Per.Brax@ourcompany.com>
> 250 2.1.5 <Per.Brax@ourcompany.com>... Recipient ok
>>>> DATA
> 354 Enter mail, end with "." on a line by itself
> timeout writing message to selumgw1.lu.se.tetrapak.com.: Broken pipe
> <Per.Brax@ourcompany.com>... Connecting to incoming-gw0.ourcompany.com.
> via esmtprec...
> <Per.Brax@ourcompany.com>... Closing connection to
> incoming-gw1.ourcompany.com.

I have no idea what it may be, but it may help to rule out the MTU-related
problems described at http://www.sendmail.org/faq/section3.html#3.10

Below is a recommended test procedure (it keeps parallel SMTP
connections to the target host out of the trace).

0) Copy sendmail.cf to sendmail-test.cf.
1) In sendmail-test.cf, change ClientPortOptions to contain a Port
specification (the option takes a comma-separated list), e.g.
O ClientPortOptions=Port=1025
This makes it easier to track the SMTP connection at the TCP/IP packet
level.
2) Use tcpdump (or a similar tool) to track packets with the local port
set in step 1.
3) Push the message in verbose mode:
sendmail -C /etc/mail/sendmail-test.cf -qI... -v
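
The reason a fixed client port helps is that an outgoing connection normally gets an arbitrary ephemeral source port, so a capture filter cannot single out one delivery attempt in advance. Here is a minimal Python sketch of the same idea outside sendmail. `connect_from_port` is a hypothetical helper written for illustration, and it is only assumed to mirror what the Port= setting does.

```python
import socket


def connect_from_port(host: str, port: int, source_port: int) -> socket.socket:
    """Open a TCP connection to (host, port) from a fixed local port."""
    # source_address makes create_connection() bind the socket to the
    # given local port before connecting, instead of letting the kernel
    # pick an ephemeral one.
    return socket.create_connection(
        (host, port), timeout=10, source_address=("", source_port)
    )


# Example use (not executed here; the host name is a placeholder):
#   conn = connect_from_port("incoming-gw1.ourcompany.com", 25, 1025)
#   print(conn.getsockname())   # local address, ending in port 1025
#   conn.close()
```

With the source port pinned, a capture filter such as `tcp port 1025` selects only the test connection. Only one connection at a time can use the fixed port, which is why this belongs in a separate test configuration.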

-- 
[pl2en: Andrew] Andrzej Adam Filip : anfi@priv.onet.pl : anfi@xl.wp.pl
http://anfi.homeunix.net/sendmail/   http://www.linkedin.com/in/andfil  
Before You Ask: http://anfi.homeunix.net/sendmail/B4UAsk-Sendmail.html
anfi (2014)
5/25/2006 9:13:00 AM
Andrzej Adam Filip wrote:

> > timeout writing message to selumgw1.lu.se.tetrapak.com.: Broken pipe
> > <Per.Brax@ourcompany.com>... Connecting to incoming-gw0.ourcompany.com.
> > via esmtprec...
> > <Per.Brax@ourcompany.com>... Closing connection to
> > incoming-gw1.ourcompany.com.
>
> I have no idea what it may be but it may help to exclude MTU related
> problems described at http://www.sendmail.org/faq/section3.html#3.10
>
> Below please find recommended test procedure (it allows to exclude
> "parallel SMTP connections to target host from tracking).
>
> 0) copy sendmail.cf to sendmail-test.cf
> 1) in sendmail-test.cf change ClientPortOptions to contain Port
> specification (comma separated list) e.g.
> O ClientPortOptions=Port=1025
>
> It will ease tracking the SMTP connection at TCP/IP packets level
> 2) use tcpdump (or similar tool) to track packets with local port as set
> in point 1
> 3) Push the messages in verbose mode
> sendmail -C /etc/mail/sendmail-test.cf -qI... -v

I saw this as the last resort since there is a firewall in between, and
also because before the upgrade I ran 8.13.1 flawlessly since its
release (or at least since I made a Solaris package of it), so MTU
problems seem a bit far-fetched. But you couldn't have known that of
course, since I didn't mention it.

But I guess I will have to do a snoop (the Solaris counterpart of
tcpdump) if I want to find out what's going on, and changing the client
port does of course make my life much easier in terms of the volume of
information caught.

Thanks for your input. 

Regards,

Per Brax

perbrax (6)
5/25/2006 8:55:10 PM
"Per Brax" <perbrax@gmail.com> writes:
> [...]
> I saw this as the last resort since there is a firewall inbetween and
> also; before the upgrade I ran 8.13.1 flawlessly since it's release (or
> at least since I made a Solaris package of it), so MTU problems seem a
> bit far fetched. But you couldn't have known that of course, since I
> didn't mention it.
>
> But I guess I will have to do a snoop (tcpdump on Solaris) if I want to
> find out what's going on, and changing the client port does of course
> make my life (volumewise in information caught) much easier.
>
> Thanks for your input. 

You may check the log files on the *receiving* host too before doing
more laborious checks. The chance that they will give a valuable hint in
*this case* is low, but the check is simple.

-- 
[pl2en: Andrew] Andrzej Adam Filip : anfi@priv.onet.pl : anfi@xl.wp.pl
http://anfi.homeunix.net/sendmail/   http://www.linkedin.com/in/andfil  
Before You Ask: http://anfi.homeunix.net/sendmail/B4UAsk-Sendmail.html
anfi (2014)
5/25/2006 9:03:41 PM
Andrzej Adam Filip wrote:

> Below please find recommended test procedure (it allows to exclude
> "parallel SMTP connections to target host from tracking).
>
> 0) copy sendmail.cf to sendmail-test.cf
> 1) in sendmail-test.cf change ClientPortOptions to contain Port
> specification (comma separated list) e.g.
> O ClientPortOptions=Port=1025
>
> It will ease tracking the SMTP connection at TCP/IP packets level
> 2) use tcpdump (or similar tool) to track packets with local port as set
> in point 1
> 3) Push the messages in verbose mode
> sendmail -C /etc/mail/sendmail-test.cf -qI... -v

I have now tried it the hard way, using a sendmail.cf which both
originates at port 1025 and connects to port 1025 on the remote host. I
used tcpdump in order to stick to a more widely used tool. I must say
that I don't feel much closer to a solution. Here are my test results:

Sendmail command line and result:

# sendmail -v -q -C/usr/local/etc/mail/sendmail-test.cf

Running /var/spool/mqtest/kBrax100000000 (sequence 1 of 1)
<Per.Brax@ourcompany.com>... Connecting to incoming-gw1.ourcompany.com.
port 1025 via esmtprec...
220 incoming-gw1.ourcompany.com ESMTP Sendmail 8.13.6/8.13.1; Thu, 1
Jun 2006 11:13:02 +0200 (CEST)
>>> EHLO mxhost4.ourcompany.com
250-incoming-gw1..ourcompany.com Hello mxhost4.ourcompany.com
[192.168.33.90], pleased to meet you
250-ENHANCEDSTATUSCODES
250-PIPELINING
250-8BITMIME
250-SIZE 14000000
250-DSN
250-ETRN
250-DELIVERBY
250 HELP
>>> MAIL From:<per@brax.obfuscated> SIZE=87750
250 2.1.0 <per@brax.obfuscated>... Sender ok
>>> RCPT To:<Per.Brax@ourcompany.com>
250 2.1.5 <Per.Brax@ourcompany.com>... Recipient ok
>>> DATA
354 Enter mail, end with "." on a line by itself
timeout writing message to incoming-gw1.ourcompany.com.: Broken pipe
<Per.Brax@ourcompany.com>... Connecting to incoming-gw3.ourcompany.com.
port 1025 via esmtprec...
<Per.Brax@ourcompany.com>... Closing connection to
incoming-gw1.ourcompany.com.

Log entries at receiving host, note that size matches what you will
later see in the tcpdump, not size of the actual file in mailqueue,
which is double this size:

May 31 11:35:39 inc-gw1 sendmail[25971]: [ID 801593 mail.crit]
k4V8ZdZQ025971: SYSERR(root): collect: read timeout on connection from
mxhost4.ourcompany.com, from=<per@brax.obfuscated>
May 31 11:35:39 inc-gw1 sendmail[25971]: [ID 801593 mail.info]
k4V8ZdZQ025971: from=<per@brax.obfuscated>, size=39941, class=0,
nrcpts=1, proto=ESMTP, daemon=Pri-MTA, relay=mxhost4.ourcompany.com
[192.168.33.90]

Tcpdump -vvv output for the above email, only the last part, since that
feels most relevant:

14:59:33.690220 IP (tos 0x0, ttl  64, id 19778, offset 0, flags [DF],
proto: TCP (6), length: 1420) mxhost4.1025 > incoming-gw1.1025: P,
cksum 0xc780 (incorrect (-> 0x591d), 34251:35631(1380) ack 461 win
49360
14:59:33.690238 IP (tos 0x0, ttl  64, id 19779, offset 0, flags [DF],
proto: TCP (6), length: 1420) mxhost4.1025 > incoming-gw1.1025: .,
cksum 0xc780 (incorrect (-> 0x112e), 35631:37011(1380) ack 461 win
49360
14:59:33.690347 IP (tos 0x0, ttl  64, id 19780, offset 0, flags [DF],
proto: TCP (6), length: 1420) mxhost4.1025 > incoming-gw1.1025: P,cksum
0xc780 (incorrect (-> 0xe596), 37011:38391(1380) ack 461 win 49360
14:59:33.690367 IP (tos 0x0, ttl  64, id 19781, offset 0, flags [DF],
proto: TCP (6), length: 1420) mxhost4.1025 > incoming-gw1.1025: .,
cksum 0xc780 (incorrect (-> 0x67e3), 38391:39771(1380) ack 461 win
49360
14:59:33.690387 IP (tos 0x0, ttl  64, id 19782, offset 0, flags [DF],
proto: TCP (6), length: 1332) mxhost4.1025 > incoming-gw1.1025: P,cksum
0xc728 (incorrect (-> 0xcb62), 39771:41063(1292) ack 461 win 49360

14:59:33.691790 IP (tos 0x0, ttl  63, id 5394, offset 0, flags [DF],
proto: TCP (6), length: 40) incoming-gw1.1025 > mxhost4.1025: ., cksum
0x8317 (correct), 461:461(0) ack 35631 win 49680
14:59:33.691845 IP (tos 0x0, ttl  63, id 5395, offset 0, flags [DF],
proto: TCP (6), length: 40) incoming-gw1.1025 > mxhost4.1025: ., cksum
0x784f (correct), 461:461(0) ack 38391 win 49680
14:59:33.691889 IP (tos 0x0, ttl  63, id 5396, offset 0, flags [DF],
proto: TCP (6), length: 40) incoming-gw1.1025 > mxhost4.1025: ., cksum
0x6ddf (correct), 461:461(0) ack 41063 win 49680
14:59:33.692589 IP (tos 0x1,ECT(1), ttl  63, id 7437, offset 0, flags
[none], proto: TCP (6), length: 40) incoming-gw1.1025 > mxhost4.1025:
R, cksum 0x94ff (correct), 2473851200:2473851200(0) win 29141

As far as I can see, the sender is seemingly OK at this level. However,
the receiver responds with an RST. One should keep in mind that this
occurs when a Sendmail 8.13.6 talks to a 8.13.6, both on Solaris. When
the receiving end runs 8.13.1 on Linux, it works fine. If the receiver
runs 8.13.1 on Solaris, it breaks the same way. If the sender runs
8.13.1 on Linux and the receiver runs 8.13.6 on Solaris, it works fine.
I'm sorry that the log excerpt does not come from the same queue run as
the tcpdump. However, the sender always aborts at the same byte
(39941). It's in the middle of a PDF file, for what it's worth.
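
The sequence numbers in the trace can be checked mechanically. Here is a minimal Python sketch: the ranges and ACK values are condensed by hand from the tcpdump output above, and `DECLARED_SIZE` is the SIZE= value from the MAIL command. The sequence numbers are relative to the start of the connection, so they also count the SMTP command bytes, not only the message.

```python
import re

# Sequence ranges of the last five data segments (mxhost4 -> incoming-gw1).
DATA_SEGMENTS = [
    "34251:35631(1380)",
    "35631:37011(1380)",
    "37011:38391(1380)",
    "38391:39771(1380)",
    "39771:41063(1292)",
]
# ACK numbers returned by incoming-gw1 before the RST.
ACKS = [35631, 38391, 41063]
# SIZE= parameter the client announced on MAIL From:.
DECLARED_SIZE = 87750


def parse_segment(text):
    """Turn 'start:end(length)' into a tuple of three integers."""
    match = re.fullmatch(r"(\d+):(\d+)\((\d+)\)", text)
    return tuple(int(group) for group in match.groups())


segments = [parse_segment(text) for text in DATA_SEGMENTS]

# Each printed length should equal end - start.
lengths_ok = all(end - start == length for start, end, length in segments)
# Each segment should begin where the previous one ended (no gap, no overlap).
contiguous = all(a[1] == b[0] for a, b in zip(segments, segments[1:]))
# The highest ACK should cover the last byte sent.
last_byte_sent = segments[-1][1]
all_acked = max(ACKS) == last_byte_sent

print(lengths_ok, contiguous, all_acked)
print(f"stream offset {last_byte_sent}, declared message size {DECLARED_SIZE}")
```

On these numbers, everything the sender put on the wire was acknowledged before the RST arrived, and the stream stopped well short of the declared message size.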

I'm really stuck here!

Regards,
Per

perbrax (6)
6/1/2006 11:06:47 AM
Per Brax wrote:
> Andrzej Adam Filip wrote:
> 
>> Below please find recommended test procedure (it allows to exclude
>> "parallel SMTP connections to target host from tracking).
>>
>> 0) copy sendmail.cf to sendmail-test.cf
>> 1) in sendmail-test.cf change ClientPortOptions to contain Port
>> specification (comma separated list) e.g.
>> O ClientPortOptions=Port=1025
>>
>> It will ease tracking the SMTP connection at TCP/IP packets level
>> 2) use tcpdump (or similar tool) to track packets with local port as set
>> in point 1
>> 3) Push the messages in verbose mode
>> sendmail -C /etc/mail/sendmail-test.cf -qI... -v
> 
> I have now tried it the hard way, using a sendmail.cf which both
> originates at port 1025 and connects to a remote with port 1025. I used
> tcpdump to stick to a tool which is more generally used. I must say
> that I don't feel much closer to a solution. Here are my test results :
> 
> Sendmail commandline and result :
> 
> # sendmail -v -q -C/usr/local/etc/mail/sendmail-test.cf
> 
> Running /var/spool/mqtest/kBrax100000000 (sequence 1 of 1)
> <Per.Brax@ourcompany.com>... Connecting to incoming-gw1.ourcompany.com.
> port 1025 via esmtprec...
> 220 incoming-gw1.ourcompany.com ESMTP Sendmail 8.13.6/8.13.1; Thu, 1
> Jun 2006 11:13:02 +0200 (CEST)
>>>> EHLO mxhost4.ourcompany.com
> 250-incoming-gw1..ourcompany.com Hello mxhost4.ourcompany.com
> [192.168.33.90], pleased to meet you
> 250-ENHANCEDSTATUSCODES
> 250-PIPELINING
> 250-8BITMIME
> 250-SIZE 14000000
> 250-DSN
> 250-ETRN
> 250-DELIVERBY
> 250 HELP
>>>> MAIL From:<per@brax.obfuscated> SIZE=87750
> 250 2.1.0 <per@brax.obfuscated>... Sender ok
>>>> RCPT To:<Per.Brax@ourcompany.com>
> 250 2.1.5 <Per.Brax@ourcompany.com>... Recipient ok
>>>> DATA
> 354 Enter mail, end with "." on a line by itself
> timeout writing message to incoming-gw1.ourcompany.com.: Broken pipe
> <Per.Brax@ourcompany.com>... Connecting to incoming-gw3.ourcompany.com.
> port 1025 via esmtprec...
> <Per.Brax@ourcompany.com>... Closing connection to
> incoming-gw1.ourcompany.com.
> 
> Log entries at receiving host, note that size matches what you will
> later see in the tcpdump, not size of the actual file in mailqueue,
> which is double this size:
> 
> May 31 11:35:39 inc-gw1 sendmail[25971]: [ID 801593 mail.crit]
> k4V8ZdZQ025971: SYSERR(root): collect: read timeout on connection from
> mxhost4.ourcompany.com, from=<per@brax.obfuscated>
> May 31 11:35:39 inc-gw1 sendmail[25971]: [ID 801593 mail.info]
> k4V8ZdZQ025971: from=<per@brax.obfuscated>, size=39941, class=0,
> nrcpts=1, proto=ESMTP, daemon=Pri-MTA, relay=mxhost4.ourcompany.com
> [192.168.33.90]
> 
> Tcpdump -vvv output for the above email, only the last part, since that
> feels most relevant:
> 
> 14:59:33.690220 IP (tos 0x0, ttl  64, id 19778, offset 0, flags [DF],
> proto: TCP (6), length: 1420) mxhost4.1025 > incoming-gw1.1025: P,
> cksum 0xc780 (incorrect (-> 0x591d), 34251:35631(1380) ack 461 win

---------------^^^^^^^^^^^
> 49360
> 14:59:33.690238 IP (tos 0x0, ttl  64, id 19779, offset 0, flags [DF],
> proto: TCP (6), length: 1420) mxhost4.1025 > incoming-gw1.1025: .,
> cksum 0xc780 (incorrect (-> 0x112e), 35631:37011(1380) ack 461 win

---------------^^^^^^^^^^
.....

> 
> As far as I can see, the sender is seemingly OK at this level. However,
> the receiver responds with an RST. One should keep in mind that this
> occurs when a Sendmail 8.13.6 talks to a 8.13.6, both on Solaris. When
> the receiving end runs 8.13.1 on Linux, it works fine. 

On the same Hardware?

> If the receiver
> runs 8.13.1 on Solaris, it breaks the same way. If the sender runs
> 8.13.1 on Linux and the receiver runs 8.13.6 on Solaris, it works fine.
> I'm sorry that the log excerpt does not come from the same queue run as
> the tcpdump. However, the sender always aborts at the same byte
> (39941). It's in the middle of a PDF file, for what it's worth.

cheers

Erich

erich.titl (120)
6/1/2006 11:46:01 AM
mega wrote:

>
> >
> > As far as I can see, the sender is seemingly OK at this level. However,
> > the receiver responds with an RST. One should keep in mind that this
> > occurs when a Sendmail 8.13.6 talks to a 8.13.6, both on Solaris. When
> > the receiving end runs 8.13.1 on Linux, it works fine.
>
> On the same Hardware?

No, I forgot that part, probably because I always run Solaris on Sparc.
Having said that, I always (so far) run Linux on i386. So there is a
fundamental architectural difference, including byte order. But hey,
we're talking about a text-based protocol here! Which of course doesn't
exclude bugs specific to one OS.

I should perhaps also add that 8.13.1 on Linux and 8.13.1 on Solaris
(same hardware as now, i386 and Sparc) worked flawlessly to my
knowledge for more than a year before doing the upgrade.
>
> If the receiver
> > runs 8.13.1 on Solaris, it breaks the same way. If the sender runs
> > 8.13.1 on Linux and the receiver runs 8.13.6 on Solaris, it works fine.
> > I'm sorry that the log excerpt does not come from the same queue run as
> > the tcpdump. However, the sender always aborts at the same byte
> > (39941). It's in the middle of a PDF file, for what it's worth.
> 
> cheers
> 
> Erich

Thanks for giving it a thought!

Per

0
perbrax (6)
6/1/2006 12:36:25 PM
Per Brax wrote:
> mega wrote:
> 
>>> As far as I can see, the sender is seemingly OK at this level. However,
>>> the receiver responds with an RST. One should keep in mind that this
>>> occurs when a Sendmail 8.13.6 talks to a 8.13.6, both on Solaris. When
>>> the receiving end runs 8.13.1 on Linux, it works fine.
>> On the same Hardware?
> 
> No, I forgot that part, probably because I always run Solaris on Sparc.
> Having said that, I always (so far) run Linux on i386. So there is a
> fundmental architectural difference, also in MSB order. But hey, we're
> talking about a text based
> protocol here! 

Yes we are, still the incorrect crc puzzles me...

Erich
erich.titl (120)
6/1/2006 3:06:12 PM
mega wrote:

> > No, I forgot that part, probably because I always run Solaris on Sparc.
> > Having said that, I always (so far) run Linux on i386. So there is a
> > fundmental architectural difference, also in MSB order. But hey, we're
> > talking about a text based
> > protocol here!
>
> Yes we are, still the incorrect crc puzzles me...

I was thinking that the checksum gets fixed on the way out of the
box (this was captured on the originating host). I'm not used to
capturing traffic originating at my own server, so I'm not used to
seeing this either. I think you will find it happens on outbound
packets only.
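
For reference, this is how the checksum that tcpdump verifies is computed. One common reason for "incorrect" checksums on packets captured at the sending host is checksum offload to the network card. The thread does not confirm that this is what happens on these hosts, so treat it as an assumption. The sketch below implements the standard Internet checksum (RFC 1071) in Python and also compares two values taken from the trace.

```python
def internet_checksum(data: bytes) -> int:
    """16-bit one's-complement checksum as described in RFC 1071."""
    if len(data) % 2:
        data += b"\x00"          # pad odd-length input with a zero byte
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]   # big-endian 16-bit words
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)  # fold carries back in
    return ~total & 0xFFFF


# Captured checksum fields from the trace: the four 1380-byte segments
# all carry 0xc780, the 1292-byte segment carries 0xc728.
placeholder_delta = 0xC780 - 0xC728
payload_delta = 1380 - 1292
print(placeholder_delta, payload_delta)
```

In the trace, the captured field is identical for four segments with different payloads and changes only with the segment length. That is consistent with a length-dependent placeholder that is completed later, rather than with corrupted data.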

Regards,

Per

perbrax (6)
6/1/2006 3:34:49 PM
Per Brax wrote:
> As far as I can see, the sender is seemingly OK at this level. However,
> the receiver responds with an RST. One should keep in mind that this
> occurs when a Sendmail 8.13.6 talks to a 8.13.6, both on Solaris. When
> the receiving end runs 8.13.1 on Linux, it works fine. If the receiver
> runs 8.13.1 on Solaris, it breaks the same way. If the sender runs
> 8.13.1 on Linux and the receiver runs 8.13.6 on Solaris, it works fine.
> I'm sorry that the log excerpt does not come from the same queue run as
> the tcpdump. However, the sender always aborts at the same byte
> (39941). It's in the middle of a PDF file, for what it's worth.

A few thoughts come to mind.  First, it really feels like a TCP/OS type
issue.  The problem is sendmail from a Solaris system cannot talk to
sendmail on a Linux system, but the reverse works.  The tcpdump trace
shows a CRC checksum error.  This is a packet level error.  Sendmail
simply writes the data to the socket; the kernel takes care of stuffing
the data into a series of TCP packets.  This leads me to think of the OS.

It might be a sendmail issue.  One thought: have you looked at what byte
39941 is?  Could it be a dot on a line by itself?  This might cause the
sender's sendmail to interpret this as an end of message indicator,
so it stops sending, thinking the message is over.  But the receiving
sendmail does not get the trailing dot, so it waits for the message to
be finished.  This is grasping at straws.
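
The lone-dot question can be answered by inspecting the queued body directly. Here is a minimal Python sketch: `lone_dot_offsets` and `context` are hypothetical helpers, and the df file path in the usage comment is a placeholder. An SMTP client is expected to double a leading dot when sending (RFC 5321, section 4.5.2), so a lone dot in the queue file should not by itself end the transfer. The check only shows whether such a line exists near the failing offset.

```python
def lone_dot_offsets(body: bytes) -> list:
    """Return the byte offsets of lines that consist of a single '.'."""
    offsets = []
    position = 0
    for line in body.splitlines(keepends=True):
        if line.rstrip(b"\r\n") == b".":
            offsets.append(position)
        position += len(line)
    return offsets


def context(body: bytes, offset: int, radius: int = 40) -> bytes:
    """Return the bytes surrounding a given offset, for eyeballing."""
    return body[max(0, offset - radius): offset + radius]


# Example use against a queued body file (path is a placeholder):
#   with open("/var/spool/mqtest/dfXXXXXXXX", "rb") as handle:
#       body = handle.read()
#   print(lone_dot_offsets(body))
#   print(context(body, 39941))
```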

A second idea is that it might be related to 8 bit data.  You might try
forcing it to use 8 -> 7 MIME conversion.  You can do this with the
option "EightBitMode" (old "8" option).  Try setting it to "m" which
will convert undeclared 8 bit data to 8 bit MIME format.  You can do
this on the command line with either:
      sendmail -o8m -qI...
or    sendmail -OEightBitMode=m -qI...
If this is the problem, you can change it in your sendmail.mc file
with:
      define(`confEIGHT_BIT_HANDLING',`m')

A variation on this idea is that you are probably using the SMTP relay
mailer for delivery, which uses the "justsend8" mailer flag.  This may
be allowing unlabeled 8 bit data through.  You can determine the mailer
with:
     sendmail -bv Per.Brax@ourcompany.com
If this is the case, then you might try changing the mailer to "esmtp".
If you are using the mailertable, you can do it there.  Otherwise you
will need to generate a test sendmail.cf file because the mailer name
is expanded from an M4 macro.  Add this to a test.mc file:
        define(`confRELAY_MAILER',`esmtp')
Check that the last rule in the MailerToTriple ruleset calls the esmtp
mailer, not the relay mailer.

Hope this helps

RLH

For info about our "Managing Internet Mail, Setting Up and Trouble
Shooting sendmail and DNS" and a schedule of dates and locations,
please send email to info@harker.com, or visit www.harker.com

Robert Harker                                   Harker Systems
Sendmail and TCP/IP Network Training            4182 Pleasant Hill Rd.
Sendmail, Network, and Sysadmin Consulting      Lincoln, CA 95648
harker@harker.com                               530-887-9990

harker (113)
6/1/2006 5:20:52 PM
In article <1149160006.969676.60170@g10g2000cwb.googlegroups.com> "Per
Brax" <perbrax@gmail.com> writes:
>
>Tcpdump -vvv output for the above email, only the last part, since that
>feels most relevant:
>
>14:59:33.690220 IP (tos 0x0, ttl  64, id 19778, offset 0, flags [DF],
>proto: TCP (6), length: 1420) mxhost4.1025 > incoming-gw1.1025: P,
>cksum 0xc780 (incorrect (-> 0x591d), 34251:35631(1380) ack 461 win
>49360
>14:59:33.690238 IP (tos 0x0, ttl  64, id 19779, offset 0, flags [DF],
>proto: TCP (6), length: 1420) mxhost4.1025 > incoming-gw1.1025: .,
>cksum 0xc780 (incorrect (-> 0x112e), 35631:37011(1380) ack 461 win
>49360
>14:59:33.690347 IP (tos 0x0, ttl  64, id 19780, offset 0, flags [DF],
>proto: TCP (6), length: 1420) mxhost4.1025 > incoming-gw1.1025: P,cksum
>0xc780 (incorrect (-> 0xe596), 37011:38391(1380) ack 461 win 49360
>14:59:33.690367 IP (tos 0x0, ttl  64, id 19781, offset 0, flags [DF],
>proto: TCP (6), length: 1420) mxhost4.1025 > incoming-gw1.1025: .,
>cksum 0xc780 (incorrect (-> 0x67e3), 38391:39771(1380) ack 461 win
>49360
>14:59:33.690387 IP (tos 0x0, ttl  64, id 19782, offset 0, flags [DF],
>proto: TCP (6), length: 1332) mxhost4.1025 > incoming-gw1.1025: P,cksum
>0xc728 (incorrect (-> 0xcb62), 39771:41063(1292) ack 461 win 49360
>
>14:59:33.691790 IP (tos 0x0, ttl  63, id 5394, offset 0, flags [DF],
>proto: TCP (6), length: 40) incoming-gw1.1025 > mxhost4.1025: ., cksum
>0x8317 (correct), 461:461(0) ack 35631 win 49680
>14:59:33.691845 IP (tos 0x0, ttl  63, id 5395, offset 0, flags [DF],
>proto: TCP (6), length: 40) incoming-gw1.1025 > mxhost4.1025: ., cksum
>0x784f (correct), 461:461(0) ack 38391 win 49680
>14:59:33.691889 IP (tos 0x0, ttl  63, id 5396, offset 0, flags [DF],
>proto: TCP (6), length: 40) incoming-gw1.1025 > mxhost4.1025: ., cksum
>0x6ddf (correct), 461:461(0) ack 41063 win 49680
>14:59:33.692589 IP (tos 0x1,ECT(1), ttl  63, id 7437, offset 0, flags
>[none], proto: TCP (6), length: 40) incoming-gw1.1025 > mxhost4.1025:
>R, cksum 0x94ff (correct), 2473851200:2473851200(0) win 29141
>
>As far as I can see, the sender is seemingly OK at this level. However,
>the receiver responds with an RST.

Well, it's surely not a sendmail problem, since a user-level program
basically can't produce such a behaviour from the TCP/IP stack (or at
least it would have to try *really* hard:-). That doesn't rule out the
possibility that changed sendmail behaviour may be more likely to
trigger the problem though.

My first question would be whether there is some firewall involved
(separate box or local to either host)? I've seen something similar when
using ipfilter in overly paranoid mode - it was set up to send RST in
response to TCP packets that weren't allowed by one of the "holes" or
matched an existing session via kept state, but this rule triggered also
for packets that *did* belong to an existing session but were outside
(ipfilter's idea of) the TCP window.

Other things that strike me as odd in the trace are the short packet
size (1420 rather than the normal 1500 on Ethernet), and the fact that
the last packet sent is shorter still (1332). Is there some tunneling,
weird stuff like PPPoE, or manual setting of MTU or MSS at either or
both ends? The extra-short packet would seem to imply that it was
actually the final DATA packet, but that can't be right if the SIZE
is. It could perhaps also be due to the sender thinking that the window
didn't allow for sending more - that seems unlikely given the following
ACKs, but only looking at the last preceding ACK would say for sure.
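
The packet sizes in the trace can be related to an MTU with simple arithmetic. Here is a minimal Python sketch, assuming IPv4 and TCP headers without options (20 bytes each), which matches the payload lengths tcpdump prints in parentheses:

```python
IP_HEADER = 20    # bytes, IPv4 header without options
TCP_HEADER = 20   # bytes, TCP header without options


def tcp_payload(ip_total_length: int) -> int:
    """Payload bytes carried by an IP packet of the given total length."""
    return ip_total_length - IP_HEADER - TCP_HEADER


def mss_for_mtu(mtu: int) -> int:
    """Largest TCP payload that fits into one packet for a given MTU."""
    return mtu - IP_HEADER - TCP_HEADER


full_segment = tcp_payload(1420)   # size of the full-sized packets in the trace
last_segment = tcp_payload(1332)   # size of the final, shorter packet
ethernet_mss = mss_for_mtu(1500)   # what plain Ethernet would allow

print(full_segment, last_segment, ethernet_mss)
```

The full-sized packets carry 1380 bytes each where plain Ethernet would allow 1460, so something on the path or on one of the hosts limits the segment size to the equivalent of a 1420-byte MTU.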

> However, the sender always aborts at the same byte
>(39941). It's in the middle of a PDF file, for what it's worth.

Surely you aren't trying to send a "raw" PDF file, which is arbitrary
binary data that can't be transferred via standard SMTP (still wouldn't
explain the trace though) - it's base64-encoded, right?
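
Whether the queued body is clean 7-bit text can be checked without sending it. Here is a minimal Python sketch: `scan_body` is a hypothetical helper, and the 998-character limit is the maximum line length from RFC 5322.

```python
import base64


def scan_body(body: bytes, max_line: int = 998) -> dict:
    """Report byte-level properties that matter for SMTP transfer."""
    lines = body.splitlines()
    return {
        "non_ascii": sum(1 for byte in body if byte > 0x7F),
        "nul_bytes": body.count(b"\x00"),
        "longest_line": max((len(line) for line in lines), default=0),
        "too_long": any(len(line) > max_line for line in lines),
    }


# Raw binary data versus the same data base64-encoded.
raw = bytes(range(256))
encoded = base64.encodebytes(raw)

print(scan_body(raw))
print(scan_body(encoded))
```

A properly base64-encoded attachment should report no non-ASCII bytes, no NUL bytes, and short lines. Anything else in the df file near the failing offset would be worth a closer look.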

--Per Hedeland
per@hedeland.org
per71 (2635)
6/2/2006 12:16:46 AM