[patch] Fetch configurable URL in alive-checking (1.9.4)
[patch] Fetch configurable URL in alive-checking (1.9.4)
Rune Saetre <rune.saetre(at)netcom-gsm.no> |
2005-11-05 08:55:43 |
[ SNIP ]
|
Hi
I wrote a patch that checks for a configurable file on the BackEnds as a way
of health checking.
The new global directive "AliveUri" specifies the file to look for. With
this patch Pound uses HEAD to check for this file on the BackEnds every
"Alive" seconds. If anything other than 200 is returned, the BackEnd is
considered down.
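The check described above can be sketched in a few lines of Python. This is only an illustration of the idea, not the patch itself (which is C code inside Pound): the function names, the default timeout and the example path are assumptions made for this sketch.

```python
import http.client


def status_means_alive(status: int) -> bool:
    """Only a 200 response counts as alive; anything else means down."""
    return status == 200


def check_alive_uri(host: str, port: int, alive_uri: str,
                    timeout: float = 5.0) -> bool:
    """Send one HEAD request for alive_uri and report whether the backend looks alive."""
    conn = http.client.HTTPConnection(host, port, timeout=timeout)
    try:
        conn.request("HEAD", alive_uri)
        return status_means_alive(conn.getresponse().status)
    except (OSError, http.client.HTTPException):
        # Connection refused, timeout, or an unparsable reply: treat as down.
        return False
    finally:
        conn.close()
```

A caller would run something like `check_alive_uri("10.0.0.5", 8080, "/alive.html")` once per backend every "Alive" seconds; the address and path here are made up.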
The code shouldn't do anything evil, and has worked correctly for me.
Most of the work is done in the new function check_alive_uri(), which could
be written somewhat more compactly. Comments are welcome. I also had to
rearrange the alive testing loops a bit in order to preserve the expected
behaviour with the ha_port. This rearranging could also have been a bit
more compact. A rewrite of it would fit in well with fixing the repeated
checks of each host with a priority higher than 1.
By the way, what happened to Joerg Wendland's patch for external health
checks? The nice feature of being able to disable a BackEnd on the load
balancer machine without having to restart pound is still desired!
Hope this can be of use to someone!
Regards
Rune
---
Rune Sætre <rune.saetre(at)netcom-gsm.no>
NetCom as, Infrastruktur
Telefon (mob): 934 34 285
..
|
|
|
|
|
Re: [Pound Mailing List] [patch] Fetch configurable URL in alive-checking (1.9.4)
"Simon Matter" <simon.matter(at)ch.sauter-bc.com> |
2005-11-05 12:53:11 |
[ SNIP ]
|
> Hi
>
> I wrote a patch to check for a configurable file on the BackEnd's as a way
> of health checking.
Hi,
I don't see any patches attached; maybe they were stripped from the mail by
the list manager?
Simon
>
> The new global directive "AliveUri" specifies the file to look for. With
> this patch Pound uses HEAD to check for this file on the BackEnd's every
> "Alive" seconds. If anything else than 200 is returned the BackEnd is
> considered down.
>
> The code shouldn't do anything evil, and has worked correctly for me.
> Most things are done in the new function check_alive_uri(), which could
> be written somewhat more compact. Comments are welcome. I also had to
> rearrange the alive testing loops a bit in order to preserve the expected
> behaviour with the ha_port. This rearranging could also have been a bit
> more compact. A rewrite of this would fit in with fixing multiple checks
> of each host with priority higher than 1.
>
> By the way, what happened to Joerg Wendland's patch for external health
> checks? The nice feature of being able to disable a BackEnd on the load
> balancer machine without having to restart pound is still desired!
>
> Hope this can be of use to anyone!
>
> Regards
> Rune
>
> ---
> Rune Sætre <rune.saetre(at)netcom-gsm.no>
> NetCom as, Infrastruktur
> Telefon (mob): 934 34 285
> ..
>
> --
> To unsubscribe send an email with subject 'unsubscribe' to pound(at)apsis.ch.
> Please contact roseg(at)apsis.ch for questions.
> http://www.apsis.ch/pound/pound_list/archive/2005/2005-11/1131177343000
>
|
|
|
Re: [Pound Mailing List] [patch] Fetch configurable URL in alive-checking (1.9.4)
Robert Segall <roseg(at)apsis.ch> |
2005-11-05 15:23:27 |
[ SNIP ]
|
On Sat, 2005-11-05 at 12:53 +0100, Simon Matter wrote:
> > Hi
> >
> > I wrote a patch to check for a configurable file on the BackEnd's as a way
> > of health checking.
>
> Hi,
>
> I don't see any patches attached; maybe they were stripped from the mail by
> the list manager?
The patches are available in the on-line list archive _only_. After
yesterday's (inadvertent) DoS, where two messages caused a more than
half a gigabyte flooding, we have decided to strip all attachments from
outgoing mail.
As to the OP: why don't you try to integrate your proposed patch with
the Pound native mechanism (HAport)? It would make a very nice script
that most people are sure to appreciate.
--
Robert Segall
Apsis GmbH
Postfach, Uetikon am See, CH-8707
Tel: +41-44-920 4904
|
|
|
Re: [Pound Mailing List] [patch] Fetch configurable URL in alive-checking (1.9.4)
Sam Johnston <samjie(at)gmail.com> |
2005-11-05 17:39:03 |
[ SNIP ]
|
> > > I wrote a patch to check for a configurable file on the BackEnd's as a way
> > > of health checking.
> >
> > Hi,
> >
> > I don't see any patches attached; maybe they were stripped from the mail by
> > the list manager?
>
> As to the OP: why don't you try to integrate your proposed patch with
> the Pound native mechanism (HAport)? It would make a very nice script
> that most people are sure to appreciate.
There's a million and one ways to monitor a backend server - adding a
config directive for each one of them isn't going to scale well. The
HAport mechanism is simple (and safe) enough, but to do anything
clever (like return some calculated value from a database to test all
tiers) you need code on your backend servers.
- samj
|
|
|
Re: [Pound Mailing List] [patch] Fetch configurable URL in alive-checking (1.9.4)
Rune Saetre <rune.saetre(at)netcom-gsm.no> |
2005-11-05 20:01:03 |
[ SNIP ]
|
Hi
>>>> I wrote a patch to check for a configurable file on the BackEnd's as a way
>>>> of health checking.
>>>
>>> Hi,
>>>
>>> I don't see any patches attached; maybe they were stripped from the mail by
>>> the list manager?
>>
>> As to the OP: why don't you try to integrate your proposed patch with
>> the Pound native mechanism (HAport)? It would make a very nice script
>> that most people are sure to appreciate.
>
> There's a million and one ways to monitor a backend server - adding a
> config directive for each one of them isn't going to scale well. The
> HAport mechanism is simple (and safe) enough, but to do anything
> clever (like return some calculated value from a database to test all
> tiers) you need code on your backend servers.
In my experience servers usually accept new TCP connections even when they
have failed, but they do not return any data. This can be due to software
errors, database access problems, server overload and other things. If one
can fetch a file, then the server must be processing requests. Usually
one can find a file that is a good indication of whether the server is
working properly.
We have a bunch of commercial load balancers of different makes, and a lot
of HTTP servers. This is the method we have settled on for the vast
majority of our servers.
Integrating this kind of health checking into pound makes sense to me
because pound already checks the servers, and it already does an excellent
job of speaking with the backend servers. Compared with writing
the same thing outside of pound, this reduces the required effort,
reduces complexity, and is more portable and more reliable.
To prevent escalation Joerg Wendland's patch could be added as well.
Then all kinds of health checking could be done centrally, and there would
be a way of temporarily disabling servers without having to reconfigure and
restart pound.
Regards
Rune
---
Rune Sætre <rune.saetre(at)netcom-gsm.no>
NetCom as, Infrastructure
..
|
|
|
Re: [patch] Fetch configurable URL in alive-checking (1.9.4)
Rune Saetre <rune.saetre(at)netcom-gsm.no> |
2005-11-06 00:58:22 |
[ SNIP ]
|
Hi again
Well, of course a tiny bug has sneaked into this patch.
The new health checking isn't done unless an HA-port is specified.
I will have a look at this, and hopefully stop pound from repeating the
health checks "priority" times as well.
Rune
|
|
|
Re: [Pound Mailing List] [patch] Fetch configurable URL in alive-checking (1.9.4)
Robert Segall <roseg(at)apsis.ch> |
2005-11-07 13:50:58 |
[ SNIP ]
|
On Sat, 2005-11-05 at 20:01 +0100, Rune Saetre wrote:
> In my experience servers usually accept new TCP connections even when they
> fail, but they do not return any data. This can be due to software errors,
> database access problems, server overload and other things. If one can
> fetch a file then the server must be processing requests. Usually
> one can find a file that is a good indication of whether the server is
> working properly.
I can give you quite a few ways for servers to fail, even when returning
a fixed file fine. For example: you have an application that does some
stuff with a database, or fetches information from another web server.
It may very well return a static page just fine, but as an application
it is dead.
> We have a bunch of commercial load balancers of different makes, and a lot
> of HTTP servers. This is the method we have settled for on the vast
> majority of our servers.
>
> Integrating this kind of health checking into pound makes sense to me
> because pound already checks the servers, and it does an excellent job at
> speaking with the backend servers already. As opposed to writing
> the same thing outside of pound this reduces the required effort,
> reduces complexity, is more portable and more reliable.
It does not reduce the required effort, as each and every way of doing a
health check would have to be supported.
I can't really imagine how adding code would reduce complexity.
Portability would benefit how exactly?
> To prevent escalation Joerg Wendland's patch could be added as well.
> Then all kinds of health checking can be done centrally, and gives a way
> of temporarily disabling servers without having to reconfigure and restart
> pound.
Wendland's patch was rejected because it requires access to the file
system. Part of Pound's design is that it never, ever accesses the hard
disk after starting.
I suggest you submit a small stand-alone program to support your type of
health-check via the HAport mechanism. We would be happy to include it
as an example, or link to it from the Pound page.
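A stand-alone helper of the kind suggested here could work by listening on the HAport only while its own health check passes, so that Pound's connection attempt succeeds or is refused accordingly. The following Python sketch shows just that gating step; the class name and its methods are hypothetical, and the health check itself is left to the caller.

```python
import socket
from typing import Optional


class HAPortGate:
    """Listen on the HAport only while the local health check passes."""

    def __init__(self, host: str, port: int) -> None:
        self.host = host
        self.port = port  # 0 lets the OS pick a free port on first open
        self._sock: Optional[socket.socket] = None

    def set_healthy(self, healthy: bool) -> None:
        """Open the listening socket when healthy, close it when not."""
        if healthy and self._sock is None:
            sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
            try:
                sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
                sock.bind((self.host, self.port))
                sock.listen(5)
            except OSError:
                sock.close()
                raise
            self.port = sock.getsockname()[1]  # remember the real port
            self._sock = sock
        elif not healthy and self._sock is not None:
            self._sock.close()  # later connection attempts are refused
            self._sock = None
```

A small loop on the backend machine would then call `gate.set_healthy(...)` every few seconds with the result of whatever test suits the application, for example a database query or a fetch of a local URL.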
Finally I think you should be working from the 2.0 code-base (which BTW
does not include the socket check you were objecting to; the check was
included as a work-around for an old Unix version and is probably no
longer required).
--
Robert Segall
Apsis GmbH
Postfach, Uetikon am See, CH-8707
Tel: +41-44-920 4904
|
|
|
Re: [Pound Mailing List] [patch] Fetch configurable URL in alive-checking (1.9.4)
Ximon Eighteen <ximon.eighteen(at)int.greenpeace.org> |
2005-11-07 14:09:48 |
[ SNIP ]
|
Robert Segall wrote:
> On Sat, 2005-11-05 at 20:01 +0100, Rune Saetre wrote:
>> In my experience servers usually accept new TCP connections even when they
>> fail, but they do not return any data. This can be due to software errors,
>> database access problems, server overload and other things. If one can
>> fetch a file then the server must be processing requests. Usually
>> one can find a file that is a good indication of whether the server is
>> working properly.
>
> I can give you quite a few ways for servers to fail, even when returning
> a fixed file fine. For example: you have an application that does some
> stuff with a database, or fetches information from another web server.
> It may very well return a static page just fine, but as an application
> it is dead.
>
>> We have a bunch of commercial load balancers of different makes, and a lot
>> of HTTP servers. This is the method we have settled for on the vast
>> majority of our servers.
>>
>> Integrating this kind of health checking into pound makes sense to me
>> because pound already checks the servers, and it does an excellent job at
>> speaking with the backend servers already. As opposed to writing
>> the same thing outside of pound this reduces the required effort,
>> reduces complexity, is more portable and more reliable.
>
> It does not reduce the required effort, as each and every way of doing a
> health check would have to be supported.
Sounds like a good case, if possible, for user contributed "plugins".
|
|
|
Re: [Pound Mailing List] [patch] Fetch configurable URL in alive-checking (1.9.4)
Rune Saetre <rune.saetre(at)netcom-gsm.no> |
2005-11-07 18:57:21 |
[ SNIP ]
|
Hi
I have made some modifications to the patch.
- ha_port no longer needs to be specified in order for the AliveUri to
be used.
- I had forgotten to close the socket after connect if no AliveUri was
specified, so pound eventually ran out of file descriptors.
- The AliveUri is now fetched using GET instead of HEAD. That makes a
better test, I think.
- Less is logged.
A complete tar archive of my current source directory can be found here:
http://folesvaert.dyndns.org/~rst/pound_stuff/Pound-1.9.4-AliveUri_no_conn_self_20051107_020408.tgz
If you want to use the patch you probably want to use this.
Of course web servers can fail in numerous ways without all files becoming
unavailable. The trick lies in specifying a URL to test that most probably
will fail if the database or something else becomes unavailable. Very often
such a URL can be found. As stated before, this works for us.
In our experience using more advanced health checking from the load
balancer usually increases complexity more than is desired.
Using the HAport feature is not an alternative for us, as about half of
our web servers are maintained by, and sometimes hosted by, third party
vendors. We cannot install software on them, and in many cases we do not
even have a login on the machines. The marketing department still likes to
use these companies, more because of their flashy webpages and glossy
brochures than for functionality and stability, I think...
When I wrote that putting this code in pound reduces effort and complexity,
I was referring to the whole installation, and not pound alone, of course.
Writing standalone programs to run on the BackEnds, doing the same checks
and bringing the HAport up and down, yields more code. Setting it up on all
our BackEnds is a major undertaking and a pain to maintain. Adding part of
that code to pound, plus one config directive, at least saves me some work
and reduces the complexity of our installation.
Portability. Since I have not used any library or system call in the new
code that is not already used in pound (with the possible exception of
strlen and strncmp) the patch should work on all platforms where pound is
already running. This way the health checking code doesn't have to be
ported to every platform the backends are running on.
I didn't know that Wendland's patch was rejected. But I completely agree
with the reason for it, of course. I still think it would be nice to have an
interface for disabling servers from a central location, both as a means
of more advanced health checking and for holding back traffic for other
reasons, but this is not critical to us at least.
The patch was written for 1.9.4 because we're running 1.9.4 already, and
when testing/debugging I thought it best if only my bugs were in there. (v2
is still experimental.) From the man page Pound 2 seems promising. My
health checking will be ported to v2 as we start using it.
If there is any interest in it I can post patches to the mailing list.
If not I will just tar up my source directory from time to time, and put
it on the web server mentioned above.
Regards
Rune
---
Rune Sætre <rune.saetre(at)netcom-gsm.no>
NetCom as, Infrastructure
..
|
|
|
Re: [Pound Mailing List] [patch] Fetch configurable URL in alive-checking (1.9.4)
Ted Dunning <tdunning(at)veoh.com> |
2005-11-07 19:32:56 |
[ SNIP ]
|
Rune,
I think you are missing the point that the ha_port doesn't have to be on
the same machine as the back end.
The right place for your patch is as a small script that does the test
you want and which also accepts or doesn't accept ha_port connections
from pound. Since you control the machine that pound is running on, you
can easily run such a script.
Robert's position is that any code that doesn't have to be in pound
should not be in pound. This is a fundamentally sound position since
security is a key goal for pound. Putting your health checks into a
different process leaves the security of pound unaffected while still
satisfying all of your stated goals.
Rune Saetre wrote:
> Hi
>
> ... lots of discussion of the patch deleted ...
--
Ted Dunning
Chief Scientist
Veoh Networks
|
|
|
Re: [Pound Mailing List] [patch] Fetch configurable URL in alive-checking (1.9.4)
Rune Saetre <rune.saetre(at)netcom-gsm.no> |
2005-11-07 20:03:57 |
[ SNIP ]
|
Hi
If that is indeed the case, where do I specify the destination IP address
to accompany the HAport?
Rune
---
Rune Sætre <rune.saetre(at)netcom-gsm.no>
NetCom as, Infrastruktur
Telefon (mob): 934 34 285
..
|
|
|
Re: [Pound Mailing List] [patch] Fetch configurable URL in alive-checking (1.9.4)
Ted Dunning <tdunning(at)veoh.com> |
2005-11-08 02:02:30 |
[ SNIP ]
|
Rune,
Let me apologize. I was half right and half wrong.
The program responding to the ha_port doesn't have to be the web server
itself (that is the half right part). You are correct, however, that it
has to be on the same machine (that is my half wrong part).
I agree with you that there should be a better approach that allows you
to do ha_port-like signaling on a different machine. As a work-around,
you can chain pound instances (pre 2.0) or listeners (2.0). That way,
the ha_port is on the same machine as the second pound instance so you
can install it. The actual requests get forwarded on to a final
destination.
This is a really ugly hack, though.
Your solution has the merit that an independent status URI could live
anywhere. Another approach would be to simply add an optional host
address to the ha_port specification. Specifying just a port would give
current behavior; adding host: before it would give the same behavior,
but the ha_port would be on the specified host.
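The proposed optional "host:" prefix could be parsed roughly as follows. This Python sketch is only an illustration of the suggestion; the function name and the exact syntax are assumptions, no such option is confirmed to exist in Pound, and IPv6 literals are not handled.

```python
from typing import Tuple


def parse_ha_port(spec: str, backend_host: str) -> Tuple[str, int]:
    """Parse 'port' or 'host:port'; a bare port keeps the backend's own address."""
    host, sep, port_text = spec.rpartition(":")
    if not sep:
        # No colon at all: current behaviour, check the backend itself.
        host = backend_host
    elif not host:
        raise ValueError("empty host in ha_port spec: %r" % spec)
    port = int(port_text)  # raises ValueError for non-numeric ports
    if not 1 <= port <= 65535:
        raise ValueError("port out of range: %d" % port)
    return host, port
```

With this, `parse_ha_port("8081", "10.0.0.5")` gives `("10.0.0.5", 8081)`, while `parse_ha_port("10.0.0.9:8081", "10.0.0.5")` gives `("10.0.0.9", 8081)`.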
Rune Saetre wrote:
> If that is indeed the case, where do I specify the destination IP
> address to accompany the HAport?
>
--
Ted Dunning
Chief Scientist
Veoh Networks
|
|
|
Re: [Pound Mailing List] [patch] Fetch configurable URL in alive-checking (1.9.4)
"Eric dai" <daibaoming(at)gmail.com> |
2005-11-08 03:35:48 |
[ SNIP ]
|
Finally, where can I get this patch?
|
|
|
Re: [Pound Mailing List] [patch] Fetch configurable URL in alive-checking (1.9.4)
Rune Saetre <rune.saetre(at)netcom-gsm.no> |
2005-11-08 04:18:26 |
[ SNIP ]
|
Hi
> From: "Eric dai"
> Finally, where can I get this patch?
The latest version of this is to be found here:
http://folesvaert.dyndns.org/~rst/pound_stuff/Pound-1.9.4-AliveUri_no_conn_self_20051107_020408.tgz
If you plan to use the patch you had better use this, and not the one on the
mailing list, as that one eats up all the file descriptors if no AliveUri is
specified. (I forgot to close the socket after the simple connect test.)
There is an issue: not all instances are tested when several webservers
are configured for the same IP address but with different ports. The
result is that servers aren't marked as failed until they are accessed, but
they are resurrected just fine. I will have a look at it as soon as
possible, and put a fixed version at the same place.
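One way to avoid both skipping backends that share an IP address and repeating a check "priority" times is to key the checks on the (address, port) pair. The Python sketch below illustrates that idea only; it is not taken from the patch, and it assumes a backend list in which an entry may appear several times (once per priority slot). All names are hypothetical.

```python
from typing import Callable, Dict, Iterable, Tuple

Endpoint = Tuple[str, int]


def check_each_endpoint_once(
    backends: Iterable[Endpoint],
    probe: Callable[[str, int], bool],
) -> Dict[Endpoint, bool]:
    """Probe every distinct (address, port) pair exactly once."""
    results: Dict[Endpoint, bool] = {}
    for addr, port in backends:
        key = (addr, port)
        if key not in results:  # same IP with another port is a new key
            results[key] = probe(addr, port)
    return results
```

The returned mapping can then be applied to every entry in the list, so two servers on the same IP with different ports each get their own result.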
Rune
---
Rune Sætre <rune.saetre(at)netcom-gsm.no>
NetCom as, Infrastructure
..
|
|
|
|