From: Florin Andrei on
On 07/06/2010 11:30 AM, Victor Duchovni wrote:
>
> No, disabling the cache will still leave a skewed distribution. Connection
> creation is uniform across the servers, but connection lifetime is much
> longer on the slow server, so its connection concurrency is much higher
> (potentially equal to the destination concurrency limit under suitable
> conditions, thus keeping the fast servers essentially idle).
>
> A time-based cache is the fairness mechanism that keeps connection
> lifetimes uniform across the servers, which ensures non-starvation
> of fast servers, and avoids further overload of (congested) slow servers.

I see.

I realize that email delivery is not a trivial problem, but it seems
baffling that a seemingly simple task ("fair" volume-based load
balancing between transports) is so hard to achieve.

A very dumb algorithm should accomplish it: single-threaded delivery (no
concurrency), a "voluntary" (sender-side) limit of N messages delivered
per connection, then reconnect. DNS randomization should then do the
trick. If the network and the servers are fast (and they are, in my
case), this shouldn't slow down the delivery too much (in fact, a small
speed decrease might be beneficial).

I think I know how to eliminate concurrency, but I'm lacking a
volume-based limit for the connections.
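
Just to make the idea concrete: if I did want to force single-threaded
delivery, that half appears to be only a main.cf knob - a rough sketch,
with "relay_out" standing in for whatever the transport feeding the
relays is actually called:

main.cf:
# at most one simultaneous delivery per destination via this transport
relay_out_destination_concurrency_limit = 1

What's missing is the other half, the per-connection message count
limit.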

I'll keep looking for a solution.

--
Florin Andrei
http://florin.myip.org/

From: Victor Duchovni on
On Tue, Jul 06, 2010 at 12:10:41PM -0700, Florin Andrei wrote:

> I realize that email delivery is not a trivial problem, but it seems
> baffling that a seemingly simple task ("fair" volume-based load balancing
> between transports) is so hard to achieve.

If you want to deliver the same number of messages to each server,
regardless of server performance (message-count fairness, rather than
concurrency fairness), and are willing to suffer high latency when a
slow server starts to impede message flow, then turning off the cache
will indeed give you roughly uniform message distribution:

- *New* connections are distributed uniformly
- There is at most one delivery per connection
- Hence messages are distributed uniformly

However, concurrency will not be distributed uniformly, and a slow
server will account for most or all of the concurrency, ensuring a
high average latency even when alternative servers are sitting idle.
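
Concretely, turning the cache off could be a single main.cf setting - a
sketch, and note that this form is global, so it affects every smtp(8)
destination, not just your two relays:

main.cf:
# do not automatically cache connections to busy destinations
smtp_connection_cache_on_demand = no

(also make sure the relay destination is not listed in
smtp_connection_cache_destinations).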

> I'll keep looking for a solution.

What negative symptoms are your systems exhibiting?
What *real* problem are you trying to solve?

--
Viktor.

From: Florin Andrei on
On 07/06/2010 12:27 PM, Victor Duchovni wrote:
>
> If you want to deliver the same number of messages to each server,
> regardless of server performance (message-count fairness, rather than
> concurrency fairness), and are willing to suffer high latency when a
> slow server starts to impede message flow, then turning off the cache
> will indeed give you roughly uniform message distribution:
>
> - *New* connections are distributed uniformly
> - There is at most one delivery per connection
> - Hence messages are distributed uniformly
>
> However, concurrency will not be distributed uniformly, and a slow
> server will account for most or all of the concurrency, ensuring a
> high average latency even when alternative servers are sitting idle.

That's fine. One transport is on the local network, the other is across
a data link that would have been considered "as fast as local" not too
long ago. Both servers are modern, fast hardware. Both are highly
available from the p.o.v. of the machines generating the emails. Even if
one of them disappears, so what - the other will just magically take
over, and at worst we're no worse off than before.

The "slow" server, therefore, is not that "slow". It's just different
enough (latency, mostly) to tip over the sensitive delivery algorithm,
which seems to be fine-tuned for Internet conditions, rather than local
or near-local networks.

From what you're saying, it appears that single-threaded delivery is
unnecessary - the email "generators" will simply hit the upper
connection limit and stay near it, with newly released slots being
occupied by either one relay or the other at random. That should ensure
a "fair" distribution, I think.

> What negative symptoms are your systems exhibiting?
> What *real* problem are you trying to solve?

The real problem was described in the other big thread I started
recently: delivery to a certain big popular email provider is
exceedingly slow. We have a pretty small delivery window between the
moment the messages are created and the moment they should be available
to the users - that's not a problem with any of the other providers
(heck, Gmail for instance seems to absorb emails way faster than we can
send them - and this even while their anti-spam filters seem at once
more fair and more effective than the other providers').

We already did some of the stuff you indicated a long time ago (the
spam feedback loop, etc.) and a while back started working on the rest
(whitelisting, etc.), which is supposed to get us out of the red zone.
But *meanwhile* I have to make the best of a tricky set of mutually
exclusive constraints.

Having multiple exit points seems to improve the overall delivery speed
- this is true even right now, when distribution is skewed to the faster
server 4:1. My estimate is, a near-1:1 distribution would actually fix
our time-constraint problem even before whitelisting. So you see how
this is kind of a big incentive to get it done.

--
Florin Andrei
http://florin.myip.org/

From: Victor Duchovni on
On Tue, Jul 06, 2010 at 01:00:14PM -0700, Florin Andrei wrote:

> Having multiple exit points seems to improve the overall delivery speed -
> this is true even right now, when distribution is skewed to the faster
> server 4:1. My estimate is, a near-1:1 distribution would actually fix our
> time-constraint problem even before whitelisting. So you see how this is
> kind of a big incentive to get it done.

So you have multiple exit points with non-uniform latency, but the more
severe congestion is downstream, so you want to load the exit points
uniformly. Yes, the solution is to disable the connection cache, and
set reasonably low connection and helo timeouts in the transport feeding
the two exit points, so that when one is down and non-responsive (no TCP
reset), you don't suffer excessive hand-off latency for 50% of deliveries.

master.cf:
# "transp" = whatever the transport feeding the two exit points is named
transp    unix  ...  smtp
    -o smtp_connect_timeout=$transp_connect_timeout
    -o smtp_helo_timeout=$transp_helo_timeout

main.cf:
# default is 30s
transp_connect_timeout = 2s
# default is 300s
transp_helo_timeout = 30s
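
For the connection-cache part mentioned above, one more override on the
same master.cf entry should do it - a sketch, scoped to this transport
only instead of turning caching off globally in main.cf:

    -o smtp_connection_cache_on_demand=no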

--
Viktor.

From: Florin Andrei on
On 07/06/2010 01:10 PM, Victor Duchovni wrote:
>
> So you have multiple exit points with non-uniform latency, but the more
> severe congestion is downstream, so you want to load the exit points
> uniformly. Yes, the solution is to disable the connection cache, and
> set reasonably low connection and helo timeouts in the transport feeding
> the two exit points, so that when one is down and non-responsive (no TCP
> reset), you don't suffer excessive hand-off latency for 50% of deliveries.

I did that.

You know what? It's amazingly accurate, actually. After tens of
thousands of messages, the logs on the two exit points showed almost
exactly the same number of messages relayed - within 1.2% or so. That
was a very nice result to contemplate.

After disabling the connection cache for internal delivery, it looks
like we took a 2x performance hit internally, which is exactly what I
expected. But that's ok, the internal rate is orders of magnitude above
the Yahoo rate anyway. From an external perspective, things are actually
much better now.

Case closed. Thanks for all the help.

--
Florin Andrei
http://florin.myip.org/