Server silently fails on messages with a huge To: header; any ideas?

Our incoming relay (sendmail) occasionally receives messages which were sent to many recipients
(sometimes it is spam, sometimes valid mailing lists to which our users have subscribed). The messages
in question have a To: header which is typically over 6 KB in size and over 80 lines long (and since
several recipients with short names/addresses may be grouped on one line, about a hundred
recipients are listed).
The relay fails when trying to pass these messages on to our backend Sun Messaging Server (6.3-6.03 x64),
and it fails silently. I am not certain whether the flaw is in SMS or in Sendmail, but perhaps someone
can shed light on the matter? :)
SMS's mail.log_current gets entries like these (here xxx.xxx.xxx.100 is the relay and xxx.xxx.xxx.73
is the backend server):
04-Dec-2008 16:54:44.62 tcp_local    +            O TCP|xxx.xxx.xxx.73|25|xxx.xxx.xxx.100|33728 SMTP
04-Dec-2008 16:59:44.62 tcp_intranet ims-ms       VE 0 [email protected] rfc822;[email protected] ouruser@ims-ms-daemon relay.domain.ru ([xxx.xxx.xxx.100]) '' Timeout after 5 minutes trying to read SMTP packet
04-Dec-2008 16:59:44.62 tcp_local    +            C TCP|xxx.xxx.xxx.73|25|xxx.xxx.xxx.100|33728 SMTP Timeout after 5 minutes trying to read SMTP packet
Sendmail logs a broken connection:
Dec  4 17:01:27 relay sendmail[14689]: [ID 801593 mail.crit] mB47gCN4014672: SYSERR(root): timeout writing message to sunmail.domain.ru.: Broken pipe
Dec  4 17:01:27 relay sendmail[14689]: [ID 801593 mail.info] mB47gCN4014672: to=<[email protected]>, delay=00:07:01, xdelay=00:06:58, mailer=esmtp, pri=329059, relay=sunmail.domain.ru. [xxx.xxx.xxx.73], dsn=4.0.0, stat=Deferred
Sniffing the wire gives strange results: the SMTP dialog part seems okay, and the message is submitted
(relayed) only for our local user's address. But the message data is not transferred until sendmail dies.
When the sendmail process dies (due to the timeout or a manual kill), about three packets appear in the
sniffer's output, starting with the usual "Received: from" lines and other header parts. The last packet
contains text from the middle of the To: header, often broken mid-word. Perhaps it is some buffering error
in either the sending Sendmail or the receiving Sunmail, or some TCP/networking glitch on the server, or a sniffer artifact.
If I manually edit the queue file (/var/spool/mqueue/qfmB47gCN4014672 for the sample above) and delete
most of the To: header's lines, the message goes through okay.
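For reference, that manual test looks roughly like this (a sketch; /usr/lib/sendmail and /var/spool/mqueue are the Solaris defaults, and -qI restricts the queue run to this single queue ID):

# trim most of the To: continuation lines in the queue control file
# (make sure no queue runner is holding the message while editing)
vi /var/spool/mqueue/qfmB47gCN4014672
# then re-attempt delivery of just this message, verbosely
/usr/lib/sendmail -v -qImB47gCN4014672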
This just does not seem logical: the message header appears to be compliant (each physical line is short,
even though the folded To: lines concatenate into a rather large text, and not an extremely large one at that).
Neither Sendmail nor Sun mail reports any error other than the network socket failure.
MTUs are the same on both servers (1500), and any other large message (e.g. one with attachments)
relays okay.
Are there any known issues with Sun Messaging Server (or Sendmail, for that matter) which look like
this and ring a bell for a casual reader? :) Perhaps Sieve filters, etc.?
Since sendmail successfully receives this message from the internet, and none of our several
incoming milters break along the way, I don't think it should have a huge problem forwarding it to
another server (I'll try experimenting, though). This is why I think it's possible that Sun mail is
at fault.
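If it helps, the experiment I have in mind is roughly this (a sketch: take sendmail out of the picture and speak SMTP to sunmail by hand; mconnect(1) is the stock Solaris helper, and plain "telnet sunmail.domain.ru 25" works just as well):

# paste the offending headers straight to sunmail and see whether DATA hangs
mconnect sunmail.domain.ru
HELO relay.domain.ru
MAIL FROM:<[email protected]>
RCPT TO:<[email protected]>
DATA
[paste the saved message headers and body here, end with a line containing only "."]
QUIT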
# imsimta version
Sun Java(tm) System Messaging Server 6.3-6.03 (built Mar 14 2008; 64bit)
libimta.so 6.3-6.03 (built 17:15:08, Mar 14 2008; 64bit)
SunOS sunmail 5.10 Generic_127112-07 i86pc i386 i86pc

Hello all, thanks for your suggestions.
In short, I debugged following Shane's suggestions. Apparently tcp_smtp_server did not receive
a byte for 5 minutes, so the read() call was blocked. At least there is no specific failing routine
in Sunmail, so I'm back to researching Sendmail, networking, buffering and so on.
As I mentioned, when the relay's sendmail process is killed, the system spits out about three
packets of header data onto the network...
Details follow...
By "silently failing" i meant that no obvious SMTP error is issued. The connection hangs
until it's aborted and both servers only complain on that - a failed network connection.
The resulting problem is that the sendmail relay marks sunmail as "Deferring connections"
in its hoststatus table, and valid messages are not even attempted for submission. At the
moment we fixed that brutally but effectively - by removing the hoststatus file for our sunmail
via cron every minute.
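For the record, the cron workaround is roughly this (a sketch; the path assumes sendmail's default HostStatusDirectory of .hoststat under /var/spool/mqueue, so check sendmail.cf, and "purgestat" / "sendmail -bH" would flush the whole persistent host status database instead):

# root crontab entry on the relay: drop the cached host status for sunmail every minute
* * * * * find /var/spool/mqueue/.hoststat -name '*sunmail*' -exec rm -f {} \; 2>/dev/null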
Concerning Mark's post: these servers are in the same DMZ, on a Cisco 2960G switch
which has caused no specific problems. I mentioned that the MTUs are the same and standard
because a few weeks back we did have LDAP replication problems due to experiments
with jumbo frames, but we solved them internally (I posted about this in the DSEE forum, also
asking how to compare LDAPs: [http://forums.sun.com/thread.jspa?threadID=5349017]).
We have been using this relay/backend tandem for half a year now (and before we deployed
Sun Messaging Server, this sendmail relayed mail to our old server for many years).
So far this (a large To: header) is the only type of message I have seen cause such behavior;
for any other large mail the size does not matter, or at least one of the SMTP engines generates
some explanation for the rejection.
Shane, thanks for your help over and over ;)
I tried enabling the options you mentioned, ran "imsimta cnbuild" and reloaded the services.
Then I fired up the sniffer on the relay server, "tail -f mail.log_current" on the sunmail, and
submitted a "bad message" from the Sendmail queue.
In the sniffer the SMTP dialog went ok until submission of message data, where it hung as
before:
# ngrep "" tcp port  25 and host sunmail
T xxx.xxx.xxx.73:25 -> xxx.xxx.xxx.100:53200 [AP]
  220 sunmail.domain.ru -- Server ESMTP (Sun Java(tm) System Messaging Server 6.
  3-6.03 (built Mar 14 2008; 64bit))..                                      
T xxx.xxx.xxx.100:53200 -> xxx.xxx.xxx.73:25 [AP]
  EHLO relay.domain.ru..                                                         
T xxx.xxx.xxx.73:25 -> xxx.xxx.xxx.100:53200 [AP]
  250-sunmail.domain.ru..250-8BITMIME..250-PIPELINING..250-CHUNKING..250-DSN..25
  0-ENHANCEDSTATUSCODES..250-EXPN..250-HELP..250-XADR..250-XSTA..250-XCIR..25
  0-XGEN..250-XLOOP 4A70E733A15FFE33EF3564BD522B1348..250-STARTTLS..250-ETRN.
  .250-NO-SOLICITING..250 SIZE 20992000..                                   
T xxx.xxx.xxx.100:53200 -> xxx.xxx.xxx.73:25 [AP]
  MAIL From:<[email protected]> SIZE=200312..                                    
T xxx.xxx.xxx.73:25 -> xxx.xxx.xxx.100:53200 [AP]
  250 2.5.0 Address and options OK...                                       
T xxx.xxx.xxx.100:53200 -> xxx.xxx.xxx.73:25 [AP]
  RCPT To:<[email protected]> NOTIFY=SUCCESS,FAILURE,DELAY..DATA..                
T xxx.xxx.xxx.73:25 -> xxx.xxx.xxx.100:53200 [AP]
  250 2.1.5 [email protected] and options OK...                                   
T xxx.xxx.xxx.73:25 -> xxx.xxx.xxx.100:53200 [AP]
  354 Enter mail, end with a single "."...                                  
#
In mail.log_current, just one line appeared:
05-Dec-2008 10:51:18.46 tcp_local    +            O TCP|xxx.xxx.xxx.73|25|xxx.xxx.xxx.100|53200 SMTP
Since it also mentions the tcp_local channel, I decided to enable slave_debug on that as well.
I rebuilt the configs and ran stop-msg to see whether the processes actually died. When I checked
the "netstat -an | grep -w 25" and "ps -ef" outputs, there was indeed still a tcp_smtp_server
process running:
mailsrv 23594   656   0 10:50:08 ?           0:00 /opt/SUNWmsgsr/messaging64/lib/tcp_smtp_server
Both sunmail and the sendmail relay kept the socket ESTABLISHED. I took a pstack
of the tcp_smtp_server (below) and killed it with SIGSEGV, so I have a core dump if
needed.
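In case it is useful, the capture was along these lines (a sketch; gcore(1) is the gentler alternative that snapshots a core without terminating the process, kill -SEGV is what I actually used, and 23594 is the stuck tcp_smtp_server above):

# save the stack and a core image of the hung tcp_smtp_server
pstack 23594 > /var/tmp/tcp_smtp_server.pstack
gcore -o /var/tmp/tcp_smtp_server 23594   # writes /var/tmp/tcp_smtp_server.23594
kill -SEGV 23594                          # forces the process to dump core and exit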
Then I started the services and submitted the message from the queue again.
The SMTP dialog log was actually from tcp_local, and it ended with lines like these
(note that even in this detailed log it just died with "network read failed" after 5 minutes;
I inserted an empty line to make that more visible):
11:21:18.26: Good address count 1 defer count 0
11:21:18.26: Copy estimate after address addition is 2
11:21:18.26: mmc_rrply: Return detailed status information.
11:21:18.26: mmc_rrply: Returning
11:21:18.26: Sending    : "250 2.1.5 [email protected] and options OK."
11:21:18.26: Received   : "DATA"
11:21:18.26: mmc_waend(0x00749cc0) called.
11:21:18.26:   Copy estimate is 2
11:21:18.26:   Queue area size 35152252, temp area size 2785988
11:21:18.26:   8788063 blocks of effective free queue space available; setting disk limit accordingly.
11:21:18.26:   1392994 blocks of free temporary space available; setting disk limit accordingly.
11:21:18.26: Sending    : "354 Enter mail, end with a single "."."

11:26:18.27: os_smtp_read: [9] network read failed with error 145
11:26:18.27:     Error: Connection timed out
11:26:18.27:   Generating V records for all addresses on channel ims-ms                          .
11:26:18.27: mmc_flatten_address: Flattening address tree into a list.
11:26:18.27:   Tree prior to flattening:
11:26:18.27: Level/Node/Left/Right Address
11:26:18.27: 0/0x0072ea30/0x00000000/0x00866050
11:26:18.27: 1/0x00866050/0x00751ef8/0x00751ef8 ouruser@ims-ms-daemon
11:26:18.27: Zero address: 0x00751ef8
11:26:18.27: smtpc_enqueue returning a status of 137 (Timeout)
11:26:18.27: SMTP routine failure from SMTPC_ENQUEUE
11:26:18.27: pmt_close: [9] status 0
Apparently, tcp_smtp_server did not receive a byte for 5 minutes, so a read() call was blocked,
and perhaps this is what prevented stop-msg from killing this process...
At least there is no specific failing routine in Sunmail, so I'm back to researching Sendmail,
networking, buffering and so on. As I mentioned, when the relay's sendmail
process is killed, the system spits out about three packets of header data onto the network...
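The next thing I plan to check on the buffering theory, as a sketch (Solaris netstat on both boxes; the interesting columns are Send-Q on the relay and Recv-Q on sunmail for the hung connection, and the grep patterns below just reuse the masked addresses from the logs):

# on the relay: is sendmail's header data sitting unsent in the socket's Send-Q?
netstat -an -f inet -P tcp | grep 'xxx.xxx.xxx.73.25'
# on sunmail: has the data arrived but been left unread in Recv-Q by tcp_smtp_server?
netstat -an -f inet -P tcp | grep 'xxx.xxx.xxx.100'

If Send-Q stays non-zero on the relay while Recv-Q stays empty on sunmail, the data never leaves the relay's kernel; if Recv-Q grows on sunmail, then tcp_smtp_server simply is not reading what has already arrived.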
The pstack output for the waiting tcp_smtp_server process follows, for completeness' sake:
23594:  /opt/SUNWmsgsr/messaging64/lib/tcp_smtp_server
-----------------  lwp# 1 / thread# 1  --------------------
fffffd7ffd830007 lwp_park (0, 0, 0)
fffffd7ffd829c14 cond_wait_queue () + 44
fffffd7ffd82a1a9 _cond_wait () + 59
fffffd7ffd82a1d6 cond_wait () + 26
fffffd7ffd82a219 pthread_cond_wait () + 9
fffffd7ffededf3e dispatcher_initialize () + 66e
0000000000404078 main () + 768
00000000004036fc ???????? ()
-----------------  lwp# 2 / thread# 2  --------------------
fffffd7ffd830007 lwp_park (0, fffffd7ffc5fdda0, 0)
fffffd7ffd829c14 cond_wait_queue () + 44
fffffd7ffd82a012 cond_wait_common () + 1c2
fffffd7ffd82a286 _cond_timedwait () + 56
fffffd7ffd82a310 cond_timedwait () + 30
fffffd7ffd82a359 pthread_cond_timedwait () + 9
fffffd7ffd520ff4 PR_WaitCondVar () + 264
fffffd7ffd529854 PR_Sleep () + 74
fffffd7ffd62d5d8 LockPoller () + 88
fffffd7ffd5289e7 _pt_root () + f7
fffffd7ffd82fd5b _thr_setup () + 5b
fffffd7ffd82ff90 _lwp_start ()
-----------------  lwp# 3 / thread# 3  --------------------
fffffd7ffd830007 lwp_park (0, fffffd7ffc3fdda0, 0)
fffffd7ffd829c14 cond_wait_queue () + 44
fffffd7ffd82a012 cond_wait_common () + 1c2
fffffd7ffd82a286 _cond_timedwait () + 56
fffffd7ffd82a310 cond_timedwait () + 30
fffffd7ffd82a359 pthread_cond_timedwait () + 9
fffffd7ffd520ff4 PR_WaitCondVar () + 264
fffffd7ffd529854 PR_Sleep () + 74
fffffd7ffd62d5d8 LockPoller () + 88
fffffd7ffd5289e7 _pt_root () + f7
fffffd7ffd82fd5b _thr_setup () + 5b
fffffd7ffd82ff90 _lwp_start ()
-----------------  lwp# 4 / thread# 4  --------------------
fffffd7ffd830007 lwp_park (0, 0, 0)
fffffd7ffd829c14 cond_wait_queue () + 44
fffffd7ffd82a1a9 _cond_wait () + 59
fffffd7ffd82a1d6 cond_wait () + 26
fffffd7ffd82a219 pthread_cond_wait () + 9
fffffd7ffedf5fe8 pmt_refresh_stats () + d8
fffffd7ffd82fd5b _thr_setup () + 5b
fffffd7ffd82ff90 _lwp_start ()
-----------------  lwp# 5 / thread# 5  --------------------
fffffd7ffedecf10 dispatcher_read(), exit value = 0x0000000000000000
        ** zombie (exited, not detached, not yet joined) **
-----------------  lwp# 6 / thread# 6  --------------------
fffffd7ffd830007 lwp_park (0, fffffd7ffc1fded0, 0)
fffffd7ffd829c14 cond_wait_queue () + 44
fffffd7ffd82a012 cond_wait_common () + 1c2
fffffd7ffd82a286 _cond_timedwait () + 56
fffffd7ffd82a310 cond_timedwait () + 30
fffffd7ffd82a359 pthread_cond_timedwait () + 9
fffffd7ffeded829 dispatcher_housekeeping () + 1e9
fffffd7ffd82fd5b _thr_setup () + 5b
fffffd7ffd82ff90 _lwp_start ()
-----------------  lwp# 14 / thread# 14  --------------------
fffffd7ffd83319a lwp_wait (d, fffffd7ffbdfdf24)
fffffd7ffd82c9de _thrp_join () + 3e
fffffd7ffd82cbbc pthread_join () + 1c
fffffd7ffedece66 dispatcher_joiner () + 36
fffffd7ffd82fd5b _thr_setup () + 5b
fffffd7ffd82ff90 _lwp_start ()
-----------------  lwp# 13 / thread# 13  --------------------
fffffd7ffd832caa pollsys  (fffffd7ffc1b9860, 1, fffffd7ffc1b97a0, 0)
fffffd7ffd7d9dc2 poll () + 52
fffffd7ffee6d7e8 pmt_recvfrom () + 868
0000000000405a3f os_smtp_read () + 1ff
0000000000404e3d smtp_get () + 9d
fffffd7ffec0fda7 big_smtp_read () + 797
fffffd7ffec36798 data () + a28
fffffd7ffec460ad smtpc_enqueue () + f9d
0000000000405343 tcp_smtp_slave () + 223
00000000004038a4 tcp_smtp_slave_pre () + 54
fffffd7ffedeccbc dispatcher_newtcp () + 46c
fffffd7ffd82fd5b _thr_setup () + 5b
fffffd7ffd82ff90 _lwp_start ()
