load leveling among backends
load leveling among backends
John Madden <jmadden(at)ivytech.edu> |
2007-01-04 16:20:15 |
|
New to pound here, just deployed it (v2.2) last night for a farm of
portal servers with sticky sessions (what a hack that is, btw -- cookie
insertion would be an excellent feature-add). What I'm finding is that
the backends aren't getting an even request distribution.
Searching the archives, I found posts referencing a fully-random
election process, but then later something in the v2.1 series
referencing a new feature that apparently caused some bad results:
http://www.apsis.ch/pound/pound_list/archive/2006/2006-11/1163848104000
I'm seeing something similar here -- two out of three machines with
roughly equivalent session counts but the third with about 2.5x the
count of the other two. Is this "dynamic" load balancing feature still
in effect?
What should the expected behavior of v2.2 be regarding request
distribution among backends? I understand that sticky sessions will
skew *request* counts, but I'm talking about *session* counts here, not
*requests*. I'm looking for a way to ensure that a new, not-yet-known
session is assigned to a backend at least at random.
Thanks,
John
[...]
|
|
|
RE: [Pound Mailing List] load leveling among backends
"Joe Gooch" <mrwizard(at)k12system.com> |
2007-01-04 17:51:37 |
|
./configure --disable-dynscale
I believe it is enabled by default.
Joseph Gooch
Sapphire Suite Product Manager
K12 Systems, Inc.
(866) 366-9540
|
|
|
RE: [Pound Mailing List] load leveling among backends
John Madden <jmadden(at)ivytech.edu> |
2007-01-04 19:23:09 |
|
On Thu, 2007-01-04 at 11:51 -0500, Joe Gooch wrote:[...]
Yes, it seems it is. I'll give this a go -- thanks.
John
[...]
|
|
|
RE: [Pound Mailing List] load leveling among backends
John Madden <jmadden(at)ivytech.edu> |
2007-01-05 15:16:56 |
|
On Thu, 2007-01-04 at 11:51 -0500, Joe Gooch wrote:[...]
In case anyone is interested in an update, this did indeed fix the
issue. Might it be a general recommendation that those implementing
session affinity not even bother with dynscale?
John
[...]
|
|
|
Re: [Pound Mailing List] load leveling among backends
Robert Segall <roseg(at)apsis.ch> |
2007-01-06 11:34:06 |
|
On Thu, 2007-01-04 at 10:20 -0500, John Madden wrote:[...]
Session IP is pretty much the equivalent of cookie insertion, and less
intrusive, as it allows the proxy to be truly transparent.
Long term the request distribution is even: Pound measures the response
times from the various back-ends and changes the priorities so that it
is evenly balanced. That means that the nature of the requests may
influence the number of requests sent to each back-end: if a certain
client makes "heavy" requests it will get slower response times and thus
cause the priority to be lowered. I agree however that the rescaling
could do with some fine-tuning.[...]
|
|
|
Re: [Pound Mailing List] load leveling among backends
Ted Dunning <tdunning(at)veoh.com> |
2007-01-06 21:16:29 |
|
Measurements like this can often lead to surprisingly unstable
behavior. The instability comes because components which are not directly
connected start to couple their behavior due to timing effects.
A case from my experience last year can illustrate what I mean. We had
software in the wild that would check in with a very low cost access every
three minutes. The timing was implemented in such a way that the 3 minute
period was from the end of one transaction to the beginning of the next.
After the software was running for a few days, we observed very large peak
to valley ratios (10x or higher) in the number of transactions per second.
This was astounding because the software checking in was running on
thousands of completely independent computers. In spite of their
independence, these computers were synchronizing their behavior in such a
way as to eventually cause the central check-in system to fail.
The mechanism of synchronization was due to the way that the timing was
implemented. To understand what was happening, it is best to reduce all of
the transaction times to where in a repeating window they appear. The
length of the window is the average time between transactions. It also
helps to orient the window so that a random peak in volume is at the
beginning of the repeating window.
What happens is that transactions that happen to fall at the beginning of
the window where there is a peak in volume will take more than average time
to repeat and thus will move later in the window (i.e. move towards the
trailing edge of the peak). Transactions that occur in areas with lower
than average transaction volume will complete faster than average and thus
they will move earlier in the window (i.e. toward the peaks). The effect is
that a transaction peak will move through the window, ultimately collecting
all of the transactions into a single spike of traffic.
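The feedback loop described above can be made concrete with a toy simulation. This is only a sketch under stated assumptions: the client count, the per-second cost model (`base`, `per_request`) and the function names are invented for illustration and do not come from the system described in this post. It makes no claim about how quickly, or whether, a single spike forms for a given parameter choice; it only shows the timing rule (sleep measured from the end of one check-in to the start of the next) coupled to a load-dependent service time.

```python
import heapq
import random

SLEEP = 180.0  # pause from the END of one check-in to the START of the next


def simulate(clients=2000, duration=86400, base=0.05, per_request=0.02, seed=0):
    """Return the number of check-ins that started in each second."""
    rng = random.Random(seed)
    # Every client starts at a uniformly random phase: no initial coupling.
    queue = [rng.uniform(0, SLEEP) for _ in range(clients)]
    heapq.heapify(queue)
    per_second = [0] * duration
    while queue and queue[0] < duration:
        second = int(queue[0])
        batch = []
        # Collect every check-in that starts within this one-second bin.
        while queue and int(queue[0]) == second:
            batch.append(heapq.heappop(queue))
        per_second[second] = len(batch)
        # Assumed cost model: a check-in takes longer when more of them
        # arrive in the same second.
        service = base + per_request * len(batch)
        for start in batch:
            # The next check-in is scheduled relative to the END of this one.
            heapq.heappush(queue, start + service + SLEEP)
    return per_second


def peak_to_mean(counts):
    """Ratio of the busiest second to the average second (0.0 if idle)."""
    mean = sum(counts) / len(counts)
    return max(counts) / mean if mean else 0.0
```

One way to use it is to compare `peak_to_mean(counts[:3600])` with `peak_to_mean(counts[-3600:])` for `counts = simulate()` and see whether the ratio drifts upward under the assumed cost model.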
This example is VERY different in detail from what happens in load
balancing, but the basic idea that seemingly independent computers can
synchronize their behavior does apply to load balancers. Any time you
deviate from completely randomized assignments, you are effectively
introducing this sort of coupling because the effect of one transaction
assignment interacts with other transaction assignments. One of the
simplest ways that this occurs in load balancers is when a transient heavy
load on one server causes a sharp increase in the number of transactions
being assigned to a particular other server. After the transient load
resolves on the first server, it is under-assigned and at the same time the
second server is over-loaded. Without very careful design, this dynamic
coupling of loading causes oscillation. Moreover, the oscillation can be
fast and asymmetric causing an appearance of persistent imbalance.
There are no particularly good solutions to this problem. You want to
respond quickly when a server bogs down, but you want to respond very
deliberately and slowly to avoid oscillations. There are a few design
guidelines that can help:
A) if you do use response time as a measure of load/capacity, you need to
average this over a relatively long period of time.
B) all assignments should be done using weighted random methods and NEVER by
linear scan, even when a dead host is found.
C) number of connections is usually the best short term thing to balance.
My recommendation is usually to slowly adapt a target number of connections
per host using response time metrics, but to always use randomized selection
based on weighted number of connections when assigning incoming connections.
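Guidelines A and B can be sketched roughly as follows. This is not Pound's code: the `Backend` class, its field names, the smoothing factor `alpha` and the inverse-response-time weighting are all assumptions chosen for illustration.

```python
import random


class Backend:
    """Hypothetical backend record; names and defaults are illustrative."""

    def __init__(self, name, alive=True, initial_avg=0.1):
        self.name = name
        self.alive = alive
        self.avg_response = initial_avg  # long-term average, in seconds

    def record_response(self, seconds, alpha=0.01):
        # Guideline A: a small alpha averages over many requests, so a
        # single slow response barely moves the estimate.
        self.avg_response += alpha * (seconds - self.avg_response)

    def weight(self):
        # Assumed rule: weight is inversely proportional to the long-term
        # average response time.
        return 1.0 / max(self.avg_response, 1e-6)


def pick_backend(backends, rng=random):
    # Guideline B: weighted random choice among the live backends, never
    # a linear scan to "the next one" when a dead host turns up.
    live = [b for b in backends if b.alive]
    if not live:
        raise RuntimeError("no live backends")
    weights = [b.weight() for b in live]
    return rng.choices(live, weights=weights, k=1)[0]
```

Because a dead host is simply left out of the draw, its share is spread over all survivors in proportion to their weights instead of landing on whichever host follows it in the list.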
Sorry for the long post. I hope it doesn't come across as condescending.
On 1/6/07 2:34 AM, "Robert Segall" <roseg(at)apsis.ch> wrote:
[...][...][...]
|
|
|
RE: [Pound Mailing List] load leveling among backends
"John Snowdon" <J.P.Snowdon(at)newcastle.ac.uk> |
2007-01-08 09:48:04 |
|
An interesting read, Ted.
John Snowdon - IT Support Specialist
-==========================================-
School of Medical Education Development
Faculty of Medical Sciences Computing
University of Newcastle
Email : j.p.snowdon(at)ncl.ac.uk
|
|
|
Re: [Pound Mailing List] load leveling among backends
Robert Segall <roseg(at)apsis.ch> |
2007-01-08 18:27:54 |
|
On Sat, 2007-01-06 at 12:16 -0800, Ted Dunning wrote:[...]
Agreed. In the coming 2.2.2 we'll see much longer averaging periods (15
minutes rather than 30 seconds).
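To see why the longer period matters, here is a small sketch comparing how far a short stall moves a 30-second average and a 15-minute average. The thread does not say how Pound computes its average, so the exponential moving average, the one-sample-per-second input and the name `smoothed_peak` are assumptions.

```python
import math


def smoothed_peak(samples, tau, dt=1.0):
    """Largest value reached by an exponential moving average.

    tau is the averaging time constant in seconds, dt the sample spacing.
    """
    alpha = 1.0 - math.exp(-dt / tau)
    avg = samples[0]
    peak = avg
    for x in samples[1:]:
        avg += alpha * (x - avg)
        peak = max(peak, avg)
    return peak


# One response-time sample per second: 0.1 s normally, 2.0 s during a
# 10-second stall (made-up numbers).
samples = [0.1] * 60 + [2.0] * 10 + [0.1] * 60

short = smoothed_peak(samples, tau=30.0)   # roughly a 30-second memory
long_ = smoothed_peak(samples, tau=900.0)  # roughly a 15-minute memory
```

With these made-up samples the 30-second average climbs to about 0.64 s, while the 15-minute average only reaches about 0.12 s, so a priority derived from the longer average hardly reacts to the transient.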
[...]
Correct - that is exactly how things are done.
[...]
I suppose you mean number of requests - connections are a difficult
concept in HTTP. Unfortunately we have to occasionally rescale the
numbers in order to avoid overflow, so we use average response times
instead, even with the risk of a sub-optimal metric.
[...]
As you correctly remarked, the biggest problem is introducing enough
hysteresis to avoid thrashing. Hopefully the newer code will solve that
- it's in testing right now. I hope you'll provide some feedback once we
release it.
[...]
I only wish we had more posts like yours - please keep it up.[...]
|
|
|
Re: [Pound Mailing List] load leveling among backends
Ted Dunning <tdunning(at)veoh.com> |
2007-01-08 19:02:21 |
|
On 1/8/07 9:27 AM, "Robert Segall" <roseg(at)apsis.ch> wrote:
[...][...]
Sorry, I used bad terminology here.
I should have said balancing the number of pending requests: a
semi-deterministic rule that gives new requests to the back-end with the
smallest number of outstanding requests (scaled according to the long-term
averages). Ties should be broken using uniform random selection.
The effect of this is very fast adaptation in the case of server apoplexy.
If a server suddenly takes longer to process requests, perhaps due to a
garbage collection, it will pretty much instantly be given no additional
requests until it works down the backlog. I don't think that this is a good
idea if any kind of session persistence is enabled since you get back into
questions of stability. If your session creation rate is low enough, then
you will have a stable system, but in that case your load balancing will
not be all that effective either. Better to just use random assignment
reweighted by long-term average response time.
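The pending-request rule described above (fewest outstanding requests, scaled by long-term averages, ties broken uniformly at random) might look like this minimal sketch. The function name and the two dictionaries are hypothetical, and `capacity` stands in for whatever scaling the long-term averages would provide.

```python
import random


def pick_least_loaded(pending, capacity, rng=random):
    """Return the backend with the fewest pending requests per unit capacity.

    pending:  dict of name -> number of outstanding requests
    capacity: dict of name -> relative capacity from long-term averages
    """
    scaled = {name: pending[name] / capacity[name] for name in pending}
    best = min(scaled.values())
    # Exact float comparison is good enough for a sketch; a real
    # implementation might treat near-equal loads as tied.
    tied = [name for name, load in scaled.items() if load == best]
    return rng.choice(tied)  # ties broken uniformly at random
```

A backend that stalls accumulates pending requests, so it stops winning the comparison until its backlog drains. That is the fast adaptation described above, and also the source of the stability concern once sessions are pinned to a backend.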
|
|
|
|