"no back-end"
Tobias Brox <tobias(at)nordicbet.com>
2006-03-26 05:09:32
We get the "no back-end" error message for about 0.4% of our requests.
According to my investigation, we haven't had any incidents where all the
backends have been down.
I investigated a bit and found this piece of code:
    rand_backend(BACKEND *be, int pri)
    {
        while(be) {
            if(!be->alive) {
                be = be->next;
                continue;
            }
            if((pri -= be->priority) < 0)
                break;
            be = be->next;
        }
        return be;
    }
And this function is typically called like:
res = rand_backend(svc->backends, random() % svc->tot_pri);
Now, please prove me wrong - but I think there is a bug here. Let's say
there are two backends, both with priority 1 - and I suppose svc->tot_pri
would be 2 then, so the random input parameter pri will be 0 or 1.
Normally, when pri == 0 it will choose the first backend, and if pri == 1, it
will choose the second backend. But now, what happens in the code above, if
the first back-end is down, the second back-end is up, and pri == 1?
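To make the question concrete, here is a minimal sketch that traces the loop for exactly this case. The BACKEND struct below is a simplified stand-in containing only the three fields the loop reads, and scenario_selects_backend() is a hypothetical helper; neither is taken from the Pound sources. It assumes pri is still drawn from 0..1, i.e. that tot_pri has not been adjusted for the dead backend.

```c
#include <stddef.h>

/* Simplified stand-in for Pound's BACKEND: only the fields the loop reads. */
typedef struct backend {
    int alive;              /* non-zero while the backend is up */
    int priority;           /* weight in the random selection   */
    struct backend *next;
} BACKEND;

/* The selection loop as quoted in the message. */
static BACKEND *rand_backend(BACKEND *be, int pri)
{
    while (be) {
        if (!be->alive) {
            be = be->next;
            continue;
        }
        if ((pri -= be->priority) < 0)
            break;
        be = be->next;
    }
    return be;
}

/* Hypothetical helper: first backend down, second up, both with priority 1.
 * Returns 1 if some backend was selected for the given pri, 0 if the loop
 * ran off the end of the list and returned NULL. */
static int scenario_selects_backend(int pri)
{
    BACKEND second = { 1, 1, NULL };
    BACKEND first  = { 0, 1, &second };
    return rand_backend(&first, pri) != NULL;
}
```

Under these assumptions, pri == 0 selects the second backend, while pri == 1 walks past it (pri drops to 0, which is not below zero) and the function returns NULL.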
[...]
Re: "no back-end"
Tobias Brox <tobias(at)nordicbet.com>
2006-03-26 05:19:47
[Tobias Brox - Sun at 05:09:32AM +0200][...]
Sorry, seems like I'm mistaken:
    /*
     * mark a backend host as dead; remove its sessions
     */
    void
    kill_be(SERVICE *svc, BACKEND *be)
    {
        (...)
        for(b = svc->backends; b; b = b->next) {
            (...)
            if(b->alive)
                svc->tot_pri += b->priority;
        }
        (...)
    }
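The point of that loop, as far as the excerpt shows, is that tot_pri only counts live backends. A minimal sketch of the same idea in isolation follows; live_priority_sum() and the cut-down BACKEND struct are hypothetical and only illustrate the summation, they do not reproduce the elided parts of kill_be().

```c
#include <stddef.h>

/* Simplified stand-in for Pound's BACKEND: only the fields used here. */
typedef struct backend {
    int alive;              /* non-zero while the backend is up */
    int priority;           /* weight in the random selection   */
    struct backend *next;
} BACKEND;

/* Hypothetical helper, not Pound's code: total priority of live backends. */
static int live_priority_sum(const BACKEND *b)
{
    int total = 0;

    for (; b; b = b->next)
        if (b->alive)
            total += b->priority;
    return total;
}
```

With two priority-1 backends and the first one down, the sum is 1, so random() % 1 can only yield 0 and the remaining live backend is picked. A sum of 0 (nothing alive) would have to be handled before the modulo, since x % 0 is undefined in C.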
Anyone else experiencing 503s and unexpected "no backend" error messages?
We're running 1.8 in production; maybe the problems are fixed in 2.0.3?
[...]
Re: "no back-end"
Tobias Brox <tobias(at)nordicbet.com>
2006-03-27 15:20:24
Just an update on this issue.
I was investigating version 2.0.3 of pound, though we're using version 1.8
in our production setup. My project leader checked up and found
that there is a bug in 1.8, occasionally causing the "no backend" error when
a backend goes down while pound is still attempting to send requests there.
After investigating the code, I'm quite sure there shouldn't be issues with
2.0.3, but anyway:
[...]
As I found, the while-loop above will never run off the end of the list,
because the input parameter pri is always expected to be lower than the sum
of the priorities of all live servers. Anyway, I would recommend throwing in
a sanity test right before the return, logging an error if be is NULL.
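A sketch of what such a sanity test could look like is given here. The function name rand_backend_checked(), the cut-down BACKEND struct and the use of fprintf(stderr, ...) are all assumptions for illustration; the real code would use whatever logging facility Pound provides.

```c
#include <stddef.h>
#include <stdio.h>

/* Simplified stand-in for Pound's BACKEND: only the fields the loop reads. */
typedef struct backend {
    int alive;              /* non-zero while the backend is up */
    int priority;           /* weight in the random selection   */
    struct backend *next;
} BACKEND;

/* Same selection loop, plus a check right before the return.
 * fprintf() is a placeholder for the project's own logging call. */
static BACKEND *rand_backend_checked(BACKEND *be, int pri)
{
    int asked = pri;        /* keep the original value for the log line */

    while (be) {
        if (!be->alive) {
            be = be->next;
            continue;
        }
        if ((pri -= be->priority) < 0)
            break;
        be = be->next;
    }
    if (be == NULL)
        fprintf(stderr, "rand_backend: no live backend for pri %d\n", asked);
    return be;
}
```

The check does not change which backend is returned; it only makes the case visible in the logs when pri is not below the total priority of the live servers.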
[...]