Uneven balancing with 2.1.6
Steve <spm(at)fostam.franken.de> |
2006-11-18 12:08:24 |
|
Hello,
Recently I've experienced quite uneven balancing with Pound 2.1.6.
When distributing among 3 servers (all without priority, i.e. it should be
equal round robin), some servers get more than twice as many requests
as others. After a short period of time, the situation reverses
(the server getting the fewest requests now gets the most), and so on.
At first I thought it was because Pound was moved from a single-CPU to a
multi-CPU system (and was upgraded to 2.1.6 in the process).
Now I've read that "dynamic rescaling" was introduced in 2.1.5. So
I switched back to 2.1.4, and now the problem seems to be solved.
So, could my problem be caused by the dynamic rescaling? If yes, is
there any configuration option to disable it in the most recent versions
(I haven't found one)? Or will I have to stick to 2.1.4?
Thanks in advance,
Steve
|
|
|
Re: [Pound Mailing List] Uneven balancing with 2.1.6
"Yves Junqueira" <yves.junqueira(at)gmail.com> |
2006-11-19 22:33:45 |
|
On 11/18/06, Steve <spm(at)fostam.franken.de> wrote:
>
> Hello,
>
> Recently I've experienced quite uneven balancing with pound 2.1.6.
>
> When distributing among 3 servers (all w/o priority, i.e. should be
> equal round robin), there are some servers getting more than twice as
> much requests as others. After a short period of time, the situation is
> vice versa (the server getting least requests now gets most), and so on.
>
> First I thought it was because pound moved from a single CPU to a
> multiple CPU system (and by that was upgraded to 2.1.6).
>
> Now I've read that in 2.1.5 "dynamic rescaling" has been introduced. So
> I switched back to 2.1.4, and now the problem seems to be solved.
>
> So, could my problem be caused by the dynamic rescaling? If yes, is
> there any configuration option to disable it in the most recent versions
> (haven't found one)? Or will I have to stick to 2.1.4?
>
Hi,
I've recently submitted a bug report to Debian about this. See
http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=399086
I believe that:
"the idea is that the order backends are tested affects the chance
they are picked up - backends tested first has a much bigger chance to
be used. IMO, a real weighted round robin scheme should be used."
Robert, could you shed some light on this? I'd like to help fix
that, if you agree it's an issue.
--
Yves Junqueira
http://www.cetico.org - yves.junqueira(at)gmail.com
Brasília, Brasil
|
|
|
Re: [Pound Mailing List] Uneven balancing with 2.1.6
Steve <spm(at)fostam.franken.de> |
2006-11-21 00:53:25 |
|
Yves Junqueira wrote:
> "the idea is that the order backends are tested affects the chance
> they are picked up - backends tested first has a much bigger chance to
> be used. IMO, a real weighted round robin scheme should be used."
Hm... according to my measurements, with 2.1.4 the distribution among
the hosts with the same priority is equal (1-2% difference max.),
independent of the order/position.
The described imbalance occurs only with 2.1.6.
Steve
|
|
|
Re: [Pound Mailing List] Uneven balancing with 2.1.6
Steve Otto <sotto(at)kaboodle-inc.com> |
2006-11-21 02:06:42 |
|
Some daily statistics for 2.1.6 on my systems:
Roughly 600k/day.
On 2.0.5:
on all 3 servers, 33% each
On 2.1.6 (in config file order)
1st is 48%, 2nd is 9%, 3rd is 43%.
On 11/20/06 3:53 PM, "Steve" <spm(at)fostam.franken.de> wrote:
>
>
> Yves Junqueira wrote:
>
>> "the idea is that the order backends are tested affects the chance
>> they are picked up - backends tested first has a much bigger chance to
>> be used. IMO, a real weighted round robin scheme should be used."
>
> Hm... according to my measurements, with 2.1.4 the distribution among
> the hosts with same priority is equal (1-2% difference max.),
> independent from the order/position.
>
> Only with 2.1.6 the described unbalance occurs.
>
> Steve
--
SteveO
|
|
|
Re: [Pound Mailing List] Uneven balancing with 2.1.6
Jacques Caron <jc(at)oxado.com> |
2006-11-24 00:18:33 |
|
Hi,
I've looked into this a bit, and indeed the algorithm used to adjust
the distribution is apparently a bit too rough. If a server is faster
than the others, its priority will be continuously increased until
the server handles nearly all the traffic and probably chokes. We
actually found out that some of our servers, which we thought to be
exactly identical both hardware- and software-wise, were actually not
responding at the same speed (still have to find out why!). Obviously
if you have even slightly different hardware, or a topology that
might imply different latencies, this might get even worse.
On top of that, if you have a fairly low Alive value (which is the
interval between priority adjustments), it may start making decisions with
very little data to work on (and hence not necessarily accurate data),
increase the priorities of some servers quite quickly, and the
others, having less traffic, will see their stats move back toward
"average" values quite slowly and never get back in the game.
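As a rough illustration of the runaway effect described above, here is a minimal C sketch. It is not Pound's actual rescaling code: the names rescale_step and traffic_share, the "one step up or down per pass" rule, and the assumption that response times stay constant regardless of load are all hypothetical simplifications.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical model, NOT Pound's code: one adaptive rescaling pass.
 * A backend faster than the mean gains one priority step, a slower one
 * loses one step (never dropping below 1). There is no upper bound.
 */
void rescale_step(int prio[], const double avg_ms[], size_t n)
{
    double mean = 0.0;
    size_t i;

    assert(n > 0);
    for (i = 0; i < n; i++)
        mean += avg_ms[i];
    mean /= (double)n;

    for (i = 0; i < n; i++) {
        if (avg_ms[i] < mean)
            prio[i]++;                      /* faster than average */
        else if (avg_ms[i] > mean && prio[i] > 1)
            prio[i]--;                      /* slower than average */
    }
}

/* Fraction of traffic backend i receives if traffic follows priority. */
double traffic_share(const int prio[], size_t n, size_t i)
{
    long total = 0;
    size_t k;

    assert(i < n);
    for (k = 0; k < n; k++)
        total += prio[k];
    return (double)prio[i] / (double)total;
}
```

In this model, three backends starting at priority 5 with average response times of 95, 100 and 100 ms end up at priorities 25, 1 and 1 after 20 passes, so a 5% speed advantage turns into more than 90% of the traffic.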
Looking at what the big commercial player (F5's BigIP) does, it seems
they have quite simple load balancing methods: round robin, weighted
round robin, "fastest" (which is -involuntarily- what the current
scheme does), and least connections. Won't help us a lot, though the
"least connections" might not be the worst idea, as it is obviously
linked to the response time...
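For reference, "least connections" selection can be sketched in a few lines of C. This is only an illustration of the general technique, not code from Pound or BigIP; the function name pick_least_conn and the two parallel arrays are assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper: return the index of the alive backend with the
 * fewest active connections, or -1 if no backend is alive.
 * Ties go to the backend that appears first.
 */
int pick_least_conn(const int active[], const int alive[], size_t n)
{
    int best = -1;
    size_t i;

    for (i = 0; i < n; i++) {
        if (!alive[i])
            continue;                       /* skip dead backends */
        assert(active[i] >= 0);
        if (best < 0 || active[i] < active[best])
            best = (int)i;
    }
    return best;
}
```

A slow backend accumulates open connections and is therefore picked less often, which is how this method tracks response time without ever adjusting a priority.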
I'm not sure what the "right" approach should be, but the current
code definitely needs at the very least a switch to turn off the
adaptive code (we currently just #if 0'd it), and probably:
- some bounds on the priorities that can be set?
- a minimum number of requests before the priorities are adjusted
(tried that, but that doesn't seem to be enough)
- some way to set relative priorities (e.g. "this server responds 20%
faster, I'll send it 20% more traffic") rather than just continuously
increase the priority until the server chokes.
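The three suggestions above could be combined roughly as follows. This is a sketch under stated assumptions, not a patch for Pound: relative_priority and all of its parameters are hypothetical, and "responds 20% faster" is taken to mean an average response time of mean/1.2.

```c
#include <assert.h>

/*
 * Hypothetical sketch: derive a priority from the configured base
 * priority and the backend's speed relative to the mean, instead of
 * stepping it up or down on every pass.
 *  - fewer than min_requests samples: leave the base priority alone
 *  - otherwise scale base by mean_ms / avg_ms, round to nearest
 *  - clamp the result to [lo, hi]
 */
int relative_priority(int base, double avg_ms, double mean_ms,
                      long n_requests, long min_requests,
                      int lo, int hi)
{
    double scaled;
    int prio;

    assert(lo >= 1 && lo <= hi);
    if (n_requests < min_requests || avg_ms <= 0.0 || mean_ms <= 0.0)
        return base;                        /* not enough data */

    scaled = (double)base * mean_ms / avg_ms;
    if (scaled > (double)hi)
        return hi;                          /* also avoids overflow on cast */

    prio = (int)(scaled + 0.5);             /* round to nearest */
    if (prio < lo)
        prio = lo;
    return prio;
}
```

Because the result depends only on the current ratio and not on the previous priority, a backend that is consistently 20% faster settles at about 20% more traffic rather than drifting upward on every adjustment.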
Haven't tried it, but starting with pretty high priority values (are
they still limited to 1-9?) might help mitigate the drastic priority
adjustments?
While I'm here, a quick patch for poundctl to show a bit more
information about the backends:
diff -u Pound-2.1.6.orig/poundctl.c Pound-2.1.6/poundctl.c
--- Pound-2.1.6.orig/poundctl.c Sat Nov 4 11:28:55 2006
+++ Pound-2.1.6/poundctl.c Fri Nov 24 00:14:10 2006
@@ -143,8 +143,9 @@
                 if(be.disabled < 0)
                     break;
                 if(be.domain == PF_INET)
-                    printf(" %3d. Backend PF_INET %s:%hd %s\n", n_be++, inet_ntoa(be.addr.in.sin_addr),
-                        ntohs(be.addr.in.sin_port), be.disabled? "*D": "a");
+                    printf(" %3d. Backend PF_INET %15s:%-5hd %2s %2s %2d %5d %10.3lf %7.3lf\n", n_be++, inet_ntoa(be.addr.in.sin_addr),
+                        ntohs(be.addr.in.sin_port), be.disabled? "*D": "a", be.alive? "a": "*D",
+                        be.priority, be.n_requests, be.t_requests/1000, be.t_average/1000);
                 else
                     printf(" %3d. Backend PF_UNIX %s %s\n", n_be++, be.addr.un.sun_path,
                         be.disabled? "*D": "");
@@ -168,8 +169,9 @@
                 if(be.disabled < 0)
                     break;
                 if(be.domain == PF_INET)
-                    printf(" %3d. Backend PF_INET %s:%hd %s\n", n_be++, inet_ntoa(be.addr.in.sin_addr),
-                        ntohs(be.addr.in.sin_port), be.disabled? "*D": "a");
+                    printf(" %3d. Backend PF_INET %15s:%-5hd %2s %2s %2d %5d %10.3lf %7.3lf\n", n_be++, inet_ntoa(be.addr.in.sin_addr),
+                        ntohs(be.addr.in.sin_port), be.disabled? "*D": "a", be.alive? "a": "*D",
+                        be.priority, be.n_requests, be.t_requests/1000, be.t_average/1000);
                 else
                     printf(" %3d. Backend PF_UNIX %s %s\n", n_be++, be.addr.un.sun_path,
                         be.disabled? "*D": "");
Jacques.
|
|
|
Re: [Pound Mailing List] Uneven balancing with 2.1.6
Maurice Yarrow <yarrow(at)best.com> |
2006-11-24 05:19:55 |
|
Hello Jacques
You wrote that:
> I'm not sure what the "right" approach should be, but the current code
> definitely needs at the very least a switch to turn off the adaptive
> code (we currently just #if 0'd it)
So, I am using 2.1.6, but would, at this time, like to turn off the
dynamic load balancing.
Could you please tell me just which lines need to be #if 0'd ?
Thanks,
Maurice Yarrow
|
|
|
Re: [Pound Mailing List] Uneven balancing with 2.1.6
Jacques Caron <jc(at)oxado.com> |
2006-11-25 18:42:17 |
|
Hi,
At 05:19 24/11/2006, Maurice Yarrow wrote:
>So, I am using 2.1.6, but would, at this time, like to turn off the
>dynamic load balancing.
>Could you please tell me just which lines need to be #if 0'd ?
diff -u Pound-2.1.6.orig/svc.c Pound-2.1.6/svc.c
--- Pound-2.1.6.orig/svc.c Sat Nov 4 11:28:55 2006
+++ Pound-2.1.6/svc.c Sat Nov 25 18:40:39 2006
@@ -931,6 +931,7 @@
                 logmsg(LOG_WARNING, "thr_resurect() unlock: %s", strerror(ret_val));
         }

+#if 0
         /* scale the back-end priorities */
         for(lstn = listeners; lstn; lstn = lstn->next)
             for(svc = lstn->services; svc; svc = svc->next) {
@@ -1010,6 +1011,7 @@
                 if(ret_val = pthread_mutex_unlock(&svc->mut))
                     logmsg(LOG_WARNING, "thr_resurect() unlock: %s", strerror(ret_val));
             }
+#endif
     }
 }
Jacques.
|
|
|
Re: [Pound Mailing List] Uneven balancing with 2.1.6
Maurice Yarrow <yarrow(at)best.com> |
2006-11-25 20:41:14 |
|
Jacques
Thanks...
Maurice