From: lundman on 5 Dec 2007 01:41

Solaris 10, x86
sendmail-8.13.7

I am currently experiencing a problem on multiple servers. clientmqueue has a handful of messages that it is attempting to deliver to localhost:25. One of these messages fails, for whatever reason, and sendmail -Ac receives a Broken Pipe. All remaining emails in clientmqueue are then automatically deferred. This will go on until the emails have expired and are bounced (or I clean out the offending email).

I would guess that one of my timeouts is set too low which is why I receive the timeout error, and abrupt disconnect. Naturally sendmail -Ac will then receive Broken Pipe. But it feels undesirable that it would simply remember the problem with the connection to localhost:25, and defer all other emails without trying. Can this be disabled somehow? I am not using HostStatusDirectory (although we were), nor StatusFile (although default Sun submit.cf was).

Details:

Running /var/spool/clientmqueue/lB2JL9wt008760 (sequence 1 of 53)
iriiri21(a)censored... Connecting to [127.0.0.1] via relay...
[snip]
250 2.1.5 <iriiri21(a)censored>... Recipient ok
354 Enter mail, end with "." on a line by itself
myschool.co.kr: Name server timeout
anet.ne.jp: Name server timeout
>>> .
421 4.4.1 collect: read timeout on connection from localhost, from=<customer(a)our-host>
>>> QUIT
iriiri21(a)censored... Deferred: 421 4.4.1 collect: read timeout on connection from localhost, from=<customer(a)our-host>

Running /var/spool/clientmqueue/lB1BI2Ql026119 (sequence 2 of 53)
censored(a)email.com... Deferred

Running /var/spool/clientmqueue/lB1AQXfJ001015 (sequence 3 of 53)
censored(a)email.com... Deferred

Running /var/spool/clientmqueue/lB1ALu4c023269 (sequence 4 of 53)
censored(a)email.com... Deferred

truss details:

23344: =\r\n - - - - - - = _ N e x t P a r t _ 0 0 0 _ 0 0 A 7 _ 5 A 1
23344: 6 B C 8 3 . E 3 D D 7 4 3 E - -\r\n\r\n
23344: write(1, " > > > .\n", 6) = 6
23344: write(6, " .\r\n", 3) Err#32 EPIPE
23344: Received signal #13, SIGPIPE [ignored]
23344: read(7, 0x08279A28, 8192) = 99
23344: 4 2 1 4 . 4 . 1 c o l l e c t : r e a d t i m e o u t
23344: o n c o n n e c t i o n f r o m l o c a l h o s t , f r
23344: o m = <

truss confirms that it does not read HostStatusDirectory or StatusFile; it simply iterates over the remaining clientmqueue entries and defers them. It then sleeps 30 mins and tries again. Since the first email is the same problem email, it fails again, and all remaining emails are deferred.

Now, it would seem my system has a timeout too lean, but fixing that would only make the problem less likely to occur. I can run 2 (or more) clientmqueue runners, but that, too, would just make it less likely to occur.

Is there some way to disable this feature for submit.cf so that it does not remember a previous connection failure to localhost:25?

Even though my configuration settings are most likely bad, it seems undesirable that a single message that will never pass delivery, for whatever reason, causes all other deliveries to localhost to be skipped, and that this continues until the problem is cleared or all messages are bounced.

From: Per Hedeland on 6 Dec 2007 17:22

In article <f970ebf7-35a9-4d06-8ce1-cea7ab1e3184(a)d4g2000prg.googlegroups.com> "lundman(a)lundman.net" <lundman(a)lundman.net> writes:
>
>I would guess that one of my timeouts is set too low which is why I
>receive the timeout error, and abrupt disconnect. Naturally sendmail -Ac
>will then receive Broken Pipe. But it feels undesirable that it
>would simply remember the problem with the connection to localhost:25,
>and defer all other emails without trying. Can this be disabled
>somehow? I am not using HostStatusDirectory (although we were), nor
>StatusFile (although default Sun submit.cf was).

The "host status" is always cached during a queue run; HostStatusDirectory allows for remembering it *between* queue runs, and StatusFile is not relevant at all here (it's effectively write-only as far as sendmail is concerned).

>Running /var/spool/clientmqueue/lB2JL9wt008760 (sequence 1 of 53)
>iriiri21(a)censored... Connecting to [127.0.0.1] via relay...
>[snip]
>250 2.1.5 <iriiri21(a)censored>... Recipient ok
>354 Enter mail, end with "." on a line by itself
>myschool.co.kr: Name server timeout
>anet.ne.jp: Name server timeout
>>>> .
>421 4.4.1 collect: read timeout on connection from localhost,

>Now, it would seem my system has a timeout too lean, but fixing that
>would only make the problem less likely to occur.

Well, that depends - if you can limit the time "things" will take, you can be sure that you don't exceed a timeout. And of course it is "silly" that the delivery to localhost:25 by the MSP should ever take long enough that the MTA times out. The "things" here are specifically DNS lookups for canonicalization of addresses in headers - this is pretty pointless to do in the MSP since the MTA will do it anyway (unless you have disabled it there) - check out the section MESSAGE SUBMISSION PROGRAM in cf/README for a way to handle this.

HOWEVER, I would be very suspicious of why you have messages in your clientmqueue with such problematic addresses in the headers in the first place - the only thing there should normally be is locally generated mail from scripts or local users with "simplistic" MUAs. It seems somewhat unlikely that they would generate mail with unresolvable domains - make sure that you don't have some broken cgi script that allows spammers to abuse your box (i.e. first of all check that the "problematic" message is really legitimate).

>Is there some way to disable this feature for submit.cf so that it does
>not remember a previous connection failure to localhost:25?

If you finally want to go that route, you can set confTO_HOSTSTATUS to 0 in submit.mc, see cf/README.

--Per Hedeland
per(a)hedeland.org

From: lundman on 6 Dec 2007 19:32

On Dec 7, 7:22 am, p...(a)hedeland.org (Per Hedeland) wrote:
> In article
> <f970ebf7-35a9-4d06-8ce1-cea7ab1e3...(a)d4g2000prg.googlegroups.com>
>
> The "host status" is always cached during a queue run, the
> HostStatusDirectory allows for remembering it *between* queue runs, and
> StatusFile is not relevant at all here (it's effectively write-only as
> far as sendmail is concerned).

Ah, interesting. It had to be in-memory caching of some sort; I just did not know what name it would go by.

> Well, that depends - if you can limit the time "things" will take, you
> can be sure that you don't exceed a timeout. And of course it is
> "silly" that the delivery to localhost:25 by the MSP should ever take
> long enough that the MTA times out. The "things" here are specifically
> DNS lookups for canonicalization of addresses in headers - this is
> pretty pointless to do in the MSP since the MTA will do it anyway
> (unless you have disabled it there) - check out the section MESSAGE
> SUBMISSION PROGRAM in cf/README for a way to handle this.

So the MSP looks up hostnames, that does seem rather pointless. But there probably is a side-effect in disabling it in MSP.

> HOWEVER, I would be very suspicious of why you have messages in your
> clientmqueue with such problematic addresses in the headers in the first
> place - the only thing there should normally be is locally generated mail
> from scripts or local users with "simplistic" MUAs. It seems somewhat
> unlikely that they would generate mail with unresolvable domains - make
> sure that you don't have some broken cgi script that allows spammers to
> abuse your box (i.e. first of all check that the "problematic" message
> is really legitimate).

Oh it was spam, for sure. Which makes it worse in a way: one bad spam message will defer, and eventually bounce, all other legitimate emails. Not thousands of spam messages, just one :(

> If you finally want to go that route, you can set confTO_HOSTSTATUS to
> 0 in submit.mc, see cf/README.
>

It sounds like perhaps I do not want to go that route from your hints?

It is nice to know the root cause. Admittedly, in this situation it was a slow DNS resolver, but imagine that a specific email triggers the MTA to core dump, every time, creating Broken Pipes and, from that, all other emails being deferred. Insulting the programmers aside, and granted that it would be so unlikely it isn't funny, wouldn't you want to remove a problem that could bounce legitimate emails?

Thank you very much for your reply. I have increased the timeout, so at the very least this should be unlikely to happen again. Now to apologise to all customers...

Lund

From: Per Hedeland on 7 Dec 2007 18:44

In article <e8acc4e0-92f9-4314-b562-0625bdf1eeab(a)a35g2000prf.googlegroups.com> "lundman(a)lundman.net" <lundman(a)lundman.net> writes:
>On Dec 7, 7:22 am, p...(a)hedeland.org (Per Hedeland) wrote:
>
>> Well, that depends - if you can limit the time "things" will take, you
>> can be sure that you don't exceed a timeout. And of course it is
>> "silly" that the delivery to localhost:25 by the MSP should ever take
>> long enough that the MTA times out. The "things" here are specifically
>> DNS lookups for canonicalization of addresses in headers - this is
>> pretty pointless to do in the MSP since the MTA will do it anyway
>> (unless you have disabled it there) - check out the section MESSAGE
>> SUBMISSION PROGRAM in cf/README for a way to handle this.
>
>So the MSP looks up hostnames, that does seem rather pointless. But
>there probably is a side-effect in disabling it in MSP.

Did you check cf/README?

>> HOWEVER, I would be very suspicious of why you have messages in your
>> clientmqueue with such problematic addresses in the headers in the first
>> place - the only thing there should normally be is locally generated mail
>> from scripts or local users with "simplistic" MUAs. It seems somewhat
>> unlikely that they would generate mail with unresolvable domains - make
>> sure that you don't have some broken cgi script that allows spammers to
>> abuse your box (i.e. first of all check that the "problematic" message
>> is really legitimate).
>
>Oh it was spam, for sure.

So how did it end up in your clientmqueue?

>> If you finally want to go that route, you can set confTO_HOSTSTATUS to
>> 0 in submit.mc, see cf/README.
>
>It sounds like perhaps I do not want to go that route from your hints?

In general it's obviously sub-optimal, but I guess it's no real problem in the MSP - e.g. AFAIK it will keep trying to connect to the MTA for each message in the queue even if the MTA is actually down or something, but it should be pretty cheap and maybe there's nothing else it could spend CPU cycles on anyway... But I certainly think it's more of a hack than the alternatives.

--Per Hedeland
per(a)hedeland.org
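
For reference, the last-resort option mentioned in the thread (setting confTO_HOSTSTATUS to 0 in submit.mc) might look like the fragment below. This is only a sketch under stated assumptions: it assumes the local submit.mc already contains a FEATURE(`msp', ...) line, and placing the define after that line is a guess; cf/README explains when an override has to go before or after it. confTO_HOSTSTATUS corresponds to the Timeout.hoststatus option in the generated submit.cf.

```m4
dnl Sketch only: add to submit.mc, then rebuild and install submit.cf.
dnl Assumption: placed after the existing FEATURE(`msp', ...) line;
dnl check cf/README and feature/msp.m4 for the correct ordering.
dnl Per the thread, a value of 0 should make the MSP try the MTA again
dnl for each queued message instead of reusing the cached host status.
define(`confTO_HOSTSTATUS', `0')dnl
```

As noted in the thread, the cost is that the MSP will attempt a connection for every queued message even when the MTA really is down, and avoiding the header canonicalization lookups in the MSP (see MESSAGE SUBMISSION PROGRAM in cf/README) is probably the cleaner fix.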