[patch] Fetch configurable URL in alive-checking (1.9.4)
Rune Saetre <rune.saetre(at)netcom-gsm.no>
2005-11-05 08:55:43
Hi

I wrote a patch that checks for a configurable file on the BackEnds as a
means of health checking.

The new global directive "AliveUri" specifies the file to look for. With
this patch Pound uses HEAD to check for this file on the BackEnds every
"Alive" seconds. If anything other than 200 is returned, the BackEnd is
considered down.
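
The check described above can be sketched roughly as follows. This is an illustration in Python, not the patch itself (which is C code inside Pound). The function name is borrowed from the patch, but its signature and the timeout parameter are assumptions made for this sketch.

```python
import http.client


def check_alive_uri(host, port, uri, timeout=5.0):
    """Return True only if a HEAD request for `uri` answers with status 200."""
    conn = http.client.HTTPConnection(host, port, timeout=timeout)
    try:
        conn.request("HEAD", uri)
        return conn.getresponse().status == 200
    except (OSError, http.client.HTTPException):
        # Refused connection, timeout or malformed reply: treat the backend as down.
        return False
    finally:
        conn.close()
```

A caller would invoke this once per backend every "Alive" seconds and mark the backend down whenever it returns False.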

The code shouldn't do anything evil, and has worked correctly for me.
Most things are done in the new function check_alive_uri(), which could
be written somewhat more compactly. Comments are welcome. I also had to
rearrange the alive-testing loops a bit in order to preserve the expected
behaviour with the ha_port. This rearranging could also have been a bit
more compact. A rewrite of this would fit in with fixing the repeated
checks of each host that has a priority higher than 1.

By the way, what happened to Joerg Wendland's patch for external health 
checks? The nice feature of being able to disable a BackEnd on the load 
balancer machine without having to restart pound is still desired!

Hope this can be of use to someone!

Regards
Rune

---
Rune Sætre <rune.saetre(at)netcom-gsm.no>
NetCom as, Infrastruktur
Telefon (mob): 934 34 285
..
Attachments:  
patch_pound194_alive_uri.diff.gz application/octet-stream 3667 Bytes

Re: [Pound Mailing List] [patch] Fetch configurable URL in alive-checking (1.9.4)
"Simon Matter" <simon.matter(at)ch.sauter-bc.com>
2005-11-05 12:53:11
> Hi
>
> I wrote a patch to check for a configurable file on the BackEnd's as a way
> of health checking.

Hi,

I don't see any patches attached. Maybe they were stripped from the mail by
the list manager?

Simon


Re: [Pound Mailing List] [patch] Fetch configurable URL in alive-checking (1.9.4)
Robert Segall <roseg(at)apsis.ch>
2005-11-05 15:23:27
On Sat, 2005-11-05 at 12:53 +0100, Simon Matter wrote:
> > Hi
> >
> > I wrote a patch to check for a configurable file on the BackEnd's as a way
> > of health checking.
> 
> Hi,
> 
> I don't see any patches attached, maybe they were stripped from the mail by
> the list manager?

The patches are available in the on-line list archive _only_. After
yesterday's (inadvertent) DoS, where two messages caused a flood of more
than half a gigabyte, we have decided to strip all attachments from
outgoing mail.

As to the OP: why don't you try to integrate your proposed patch with
the Pound native mechanism (HAport)? It would make a very nice script
that most people are sure to appreciate.
-- 
Robert Segall
Apsis GmbH
Postfach, Uetikon am See, CH-8707
Tel: +41-44-920 4904


Re: [Pound Mailing List] [patch] Fetch configurable URL in alive-checking (1.9.4)
Sam Johnston <samjie(at)gmail.com>
2005-11-05 17:39:03
> > > I wrote a patch to check for a configurable file on the BackEnd's as a way
> > > of health checking.
> >
> > Hi,
> >
> > I don't see any patches attached, maybe they were stripped from the mail by
> > the list manager?
>
> As to the OP: why don't you try to integrate your proposed patch with
> the Pound native mechanism (HAport)? It would make a very nice script
> that most people are sure to appreciate.

There's a million and one ways to monitor a backend server - adding a
config directive for each one of them isn't going to scale well. The
HAport mechanism is simple (and safe) enough, but to do anything
clever (like return some calculated value from a database to test all
tiers) you need code on your backend servers.

 - samj

Re: [Pound Mailing List] [patch] Fetch configurable URL in alive-checking (1.9.4)
Rune Saetre <rune.saetre(at)netcom-gsm.no>
2005-11-05 20:01:03
Hi

>>>> I wrote a patch to check for a configurable file on the BackEnd's as a way
>>>> of health checking.
>>>
>>> Hi,
>>>
>>> I don't see any patches attached, maybe they were stripped from the mail by
>>> the list manager?
>>
>> As to the OP: why don't you try to integrate your proposed patch with
>> the Pound native mechanism (HAport)? It would make a very nice script
>> that most people are sure to appreciate.
>
> There's a million and one ways to monitor a backend server - adding a
> config directive for each one of them isn't going to scale well. The
> HAport mechanism is simple (and safe) enough, but to do anything
> clever (like return some calculated value from a database to test all
> tiers) you need code on your backend servers.

In my experience, servers usually still accept new TCP connections when they
fail, but they do not return any data. This can be due to software errors,
database access problems, server overload and other things. If one can
fetch a file, then the server must be processing requests. Usually
one can find a file that is a good indication of whether the server is
working properly.
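
The difference between the two kinds of test can be demonstrated with a small sketch. The helper names and the "hung" listener below are made up for illustration: the listener lets the kernel complete TCP handshakes, but no process ever answers, which imitates a server that accepts connections without returning data.

```python
import socket


def tcp_connect_ok(host, port, timeout=1.0):
    """Plain connect test: succeeds as soon as the connection is established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def http_reply_ok(host, port, uri, timeout=1.0):
    """Request test: succeeds only if a status line with code 200 comes back."""
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            request = f"GET {uri} HTTP/1.0\r\nHost: {host}\r\n\r\n"
            sock.sendall(request.encode("ascii"))
            with sock.makefile("rb") as reader:
                status_line = reader.readline()
    except OSError:
        # Covers refused connections as well as read timeouts.
        return False
    parts = status_line.split()
    return len(parts) >= 2 and parts[1] == b"200"


# A "hung" server: it listens, but nothing ever accepts or replies.
hung = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
hung.bind(("127.0.0.1", 0))
hung.listen(5)
hung_port = hung.getsockname()[1]

print("connect test:", tcp_connect_ok("127.0.0.1", hung_port))
print("request test:", http_reply_ok("127.0.0.1", hung_port, "/", timeout=0.5))
hung.close()
```

Against such a listener the connect test passes while the request test times out, which is the failure mode the paragraph above describes.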

We have a bunch of commercial load balancers of different makes, and a lot
of HTTP servers. This is the method we have settled on for the vast
majority of our servers.

Integrating this kind of health checking into pound makes sense to me
because pound already checks the servers, and it does an excellent job at
speaking with the backend servers already. Compared with writing
the same thing outside of pound, this reduces the required effort,
reduces complexity, and is more portable and more reliable.

To prevent escalation, Joerg Wendland's patch could be added as well.
Then all kinds of health checking could be done centrally, and there would
be a way of temporarily disabling servers without having to reconfigure
and restart pound.

Regards
Rune

---
Rune Sætre <rune.saetre(at)netcom-gsm.no>
NetCom as, Infrastructure
..


Re: [patch] Fetch configurable URL in alive-checking (1.9.4)
Rune Saetre <rune.saetre(at)netcom-gsm.no>
2005-11-06 00:58:22
Hi again

Well, of course a tiny bug has sneaked into this patch.
The new health checking isn't done unless an HA-port is specified.

I will have a look at this, and hopefully stop pound from repeating the 
health checks "priority" times as well.

Rune



Re: [Pound Mailing List] [patch] Fetch configurable URL in alive-checking (1.9.4)
Robert Segall <roseg(at)apsis.ch>
2005-11-07 13:50:58
On Sat, 2005-11-05 at 20:01 +0100, Rune Saetre wrote:
> In my experience servers usually accept new TCP connections even when they 
> fail, but they do not return any data. This can be due to software errors, 
> database access problems, server overload and other things. If one can 
> fetch a file then the server must be processing requests.  Usually 
> one can find a file that is a good indication of whether the server is
> working properly.

I can give you quite a few ways for servers to fail, even when returning
a fixed file fine. For example: you have an application that does some
stuff with a database, or fetches information from another web server.
It may very well return a static page just fine, but as an application
it is dead.

> We have a bunch of commercial load balancers of different makes, and a lot
> of HTTP servers. This is the method we have settled for on the vast 
> majority of our servers.
> 
> Integrating this kind of health checking into pound makes sense to me 
> because pound already checks the servers, and it does an excellent job at 
> speaking with the backend servers already. As opposed to writing 
> the same thing outside of pound this reduces the required effort, 
> reduces complexity, is more portable and more reliable.

It does not reduce the required effort, as each and every way of doing a
health check would have to be supported.

I can't really imagine how adding code would reduce complexity.

Portability would benefit how exactly?

> To prevent escalation Joerg Wendland's patch could be added as well.
> Then all kinds of health checking can be done centrally, and gives a way 
> of temporarily disabling servers without having to reconfigure and restart 
> pound.

Wendland's patch was rejected because it requires access to the file
system. Part of Pound's design is that it never, ever accesses the hard
disk after starting.

I suggest you submit a small stand-alone program to support your type of
health-check via the HAport mechanism. We would be happy to include it
as an example, or link to it from the Pound page.
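
A stand-alone helper of the kind suggested here might look roughly like the sketch below. It assumes, as described in this thread, that Pound treats a backend as alive when a TCP connection to its HAport succeeds. The class name HAPortGate, the `probe` callable and the polling interval are hypothetical and not part of Pound.

```python
import socket
import time


class HAPortGate:
    """Listen on a port while the service is healthy, stop listening otherwise."""

    def __init__(self, port, host="0.0.0.0"):
        self.addr = (host, port)
        self.sock = None

    def open(self):
        if self.sock is None:
            sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
            sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
            sock.bind(self.addr)
            sock.listen(16)
            sock.setblocking(False)
            self.sock = sock

    def close(self):
        if self.sock is not None:
            self.sock.close()
            self.sock = None

    def drain(self):
        """Accept and immediately close any pending probe connections."""
        while self.sock is not None:
            try:
                conn, _ = self.sock.accept()
            except BlockingIOError:
                return  # nothing left in the accept queue
            conn.close()


def run(gate, probe, interval=5.0):
    """Open or close the gate forever, depending on what probe() returns."""
    while True:
        if probe():
            gate.open()
            gate.drain()
        else:
            gate.close()
        time.sleep(interval)
```

Here `probe` can be any health test at all (fetching a URL, querying a database), which keeps that logic outside Pound itself.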

Finally I think you should be working from the 2.0 code-base (which BTW
does not include the socket check you were objecting to; the check was
included as a work-around for an old Unix version and is probably no
longer required).
-- 
Robert Segall
Apsis GmbH
Postfach, Uetikon am See, CH-8707
Tel: +41-44-920 4904


Re: [Pound Mailing List] [patch] Fetch configurable URL in alive-checking (1.9.4)
Ximon Eighteen <ximon.eighteen(at)int.greenpeace.org>
2005-11-07 14:09:48
Robert Segall wrote:
> On Sat, 2005-11-05 at 20:01 +0100, Rune Saetre wrote:
>> In my experience servers usually accept new TCP connections even when they 
>> fail, but they do not return any data. This can be due to software errors, 
>> database access problems, server overload and other things. If one can 
>> fetch a file then the server must be processing requests.  Usually 
>> one can find a file that is a good indication of whether the server is
>> working properly.
> 
> I can give you quite a few ways for servers to fail, even when returning
> a fixed file fine. For example: you have an application that does some
> stuff with a database, or fetches information from another web server.
> It may very well return a static page just fine, but as an application
> it is dead.
> 
>> We have a bunch of commercial load balancers of different makes, and a lot
>> of HTTP servers. This is the method we have settled for on the vast 
>> majority of our servers.
>>
>> Integrating this kind of health checking into pound makes sense to me 
>> because pound already checks the servers, and it does an excellent job at 
>> speaking with the backend servers already. As opposed to writing 
>> the same thing outside of pound this reduces the required effort, 
>> reduces complexity, is more portable and more reliable.
> 
> It does not reduce the required effort, as each and every way of doing a
> health check would have to be supported.

Sounds like a good case, if possible, for user contributed "plugins".


Re: [Pound Mailing List] [patch] Fetch configurable URL in alive-checking (1.9.4)
Rune Saetre <rune.saetre(at)netcom-gsm.no>
2005-11-07 18:57:21
Hi

I have made some modifications to the patch.
  - ha_port no longer needs to be specified in order for the AliveUri to
    be used.
  - I had forgotten to close the socket after connect if no AliveUri was
    specified, so pound eventually ran out of file descriptors.
  - The AliveUri is now fetched using GET instead of HEAD. That makes a
    better test, I think.
  - Less logging.
A complete tar archive of my current source directory can be found here:

http://folesvaert.dyndns.org/~rst/pound_stuff/Pound-1.9.4-AliveUri_no_conn_self_20051107_020408.tgz
If you want to use the patch, you probably want to use this.
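
The behaviour listed above (a plain connect test when no AliveUri is set, otherwise a GET, with the socket released in every case) can be sketched as follows. This is a Python illustration rather than the C code from the archive, and the function name and parameters are assumptions.

```python
import http.client
import socket


def backend_alive(host, port, alive_uri=None, timeout=5.0):
    """Connect test when alive_uri is None, otherwise GET alive_uri and expect 200."""
    if alive_uri is None:
        try:
            # The with-block closes the socket on the success path too.
            # Forgetting that step is what leaks one descriptor per check.
            with socket.create_connection((host, port), timeout=timeout):
                return True
        except OSError:
            return False

    conn = http.client.HTTPConnection(host, port, timeout=timeout)
    try:
        conn.request("GET", alive_uri)
        response = conn.getresponse()
        response.read()  # a GET reply has a body; read it to finish the exchange
        return response.status == 200
    except (OSError, http.client.HTTPException):
        return False
    finally:
        conn.close()
```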

Of course web servers can fail in numerous ways without all files becoming
unavailable. The trick lies in specifying a URL to test that will most
probably fail if the database or something else becomes unavailable. Very
often such a URL can be found. As stated before, this works for us.
In our experience, using more advanced health checking from the load
balancer usually increases complexity more than is desired.
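
For a backend whose code one does control, a URL of this kind could look like the sketch below: it answers 200 only while a trivial database query still works. The /health path, the function names and the use of sqlite3 as a stand-in database are all assumptions chosen for illustration.

```python
import sqlite3
from http.server import BaseHTTPRequestHandler, HTTPServer


def database_ok(conn):
    """Run a trivial query; any database error counts as unhealthy."""
    try:
        conn.execute("SELECT 1").fetchone()
        return True
    except sqlite3.Error:
        return False


def make_health_handler(dependency_ok):
    """Build a request handler that answers /health with 200 or 503."""

    class HealthHandler(BaseHTTPRequestHandler):
        def do_GET(self):
            if self.path != "/health":
                self.send_error(404)
                return
            healthy = dependency_ok()
            body = b"OK\n" if healthy else b"FAIL\n"
            self.send_response(200 if healthy else 503)
            self.send_header("Content-Type", "text/plain")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)

        def log_message(self, format, *args):
            pass  # keep the sketch quiet

    return HealthHandler
```

To try it, pass `make_health_handler(lambda: database_ok(conn))` to `HTTPServer`, where `conn` is an open sqlite3 connection, and call `serve_forever()`. A load balancer that fetches /health then sees the database failure as a non-200 status.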

Using the HAport feature is not an alternative for us, as about half of
our web servers are maintained by, and sometimes hosted by, third party
vendors. We cannot install software on them, and in many cases we do not
even have login on the machines. The marketing department still likes to
use these companies, more because of their flashy web pages and glossy
brochures than for functionality and stability, I think...

When writing that putting this code in pound reduces effort and complexity,
I was referring to the whole installation, and not pound alone, of course.
Writing standalone programs to run on the BackEnds doing the same checks
and bringing up and down the HAport yields more code. Setting it up on all
our BackEnds is a major undertaking and a pain to maintain. Adding part of
that code to pound, plus one config directive, at least saves me some work
and reduces the complexity of our installation.

Portability. Since I have not used any library or system call in the new 
code that is not already used in pound (with the possible exception of 
strlen and strncmp) the patch should work on all platforms where pound is 
already running. This way the health checking code doesn't have to be 
ported to every platform the backends are running on.

I didn't know that Wendland's patch was rejected. But I completely agree
with the reason for it, of course. I still think it would be nice with an
interface for disabling servers from a central location, both as a means 
of more advanced health checking and for holding back traffic for other 
reasons, but this is not critical to us at least.

The patch was written for 1.9.4 because we're running 1.9.4 already, and
when testing/debugging I thought it best if only my bugs were in there. (v2
is still experimental.) From the man page Pound 2 seems promising. My
health checking will be ported to v2 as we start using it.

If there is any interest in it, I can post patches to the mailing list.
If not, I will just tar up my source directory from time to time, and put
it on the web server mentioned above.

Regards
Rune

---
Rune Sætre <rune.saetre(at)netcom-gsm.no>
NetCom as, Infrastructure
..


Re: [Pound Mailing List] [patch] Fetch configurable URL in alive-checking (1.9.4)
Ted Dunning <tdunning(at)veoh.com>
2005-11-07 19:32:56
Rune,

I think you are missing the point that the ha_port doesn't have to be on 
the same machine as the back end.

The right place for your patch is as a small script that does the test 
you want and which also accepts or doesn't accept ha_port connections 
from pound.  Since you control the machine that pound is running on, you 
can easily run such a script.

Robert's position is that any code that doesn't have to be in pound 
should not be in pound.  This is a fundamentally sound position since 
security is a key goal for pound.  Putting your health checks into a 
different process leaves the security of pound unaffected while still 
satisfying all of your stated goals.

Rune Saetre wrote:

> Hi
>
> ... lots of discussion of the patch deleted ...


-- 
Ted Dunning
Chief Scientist
Veoh Networks



Re: [Pound Mailing List] [patch] Fetch configurable URL in alive-checking (1.9.4)
Rune Saetre <rune.saetre(at)netcom-gsm.no>
2005-11-07 20:03:57
Hi

If that is indeed the case, where do I specify the destination IP address 
to accompany the HAport?

Rune

---
Rune Sætre <rune.saetre(at)netcom-gsm.no>
NetCom as, Infrastruktur
Telefon (mob): 934 34 285
..


Re: [Pound Mailing List] [patch] Fetch configurable URL in alive-checking (1.9.4)
Ted Dunning <tdunning(at)veoh.com>
2005-11-08 02:02:30
Rune,

Let me apologize.  I was half right and half wrong.

The program responding to the ha_port doesn't have to be the web server 
itself (that is the half right part).  You are correct, however, that it 
has to be on the same machine (that is my half wrong part).

I agree with you that there should be a better approach that allows you
to do ha_port-like signaling on a different machine.  As a work-around,
you can chain pound instances (pre 2.0) or listeners (2.0).  That way,
the ha_port is on the same machine as the second pound instance so you
can install it.  The actual requests get forwarded on to a final
destination.

This is a really ugly hack, though. 

Your solution has the merit that an independent status URI could live
anywhere.  Another approach would be to simply add an optional host
address to the ha_port specification.  Specifying just a port would give
the current behavior; adding host: before it would give the same behavior,
but the ha_port would be on the specified host.
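
The proposed "port" or "host:port" form could be parsed along these lines. This is only a sketch of the idea in Python. No such syntax exists in Pound as far as this thread shows, and the function name is hypothetical. IPv6 literals are deliberately not handled.

```python
def parse_ha_port(spec, backend_host):
    """Parse 'port' or 'host:port'; a bare port means the backend's own host."""
    host, sep, port_text = spec.rpartition(":")
    if not sep:
        host = backend_host  # no colon: keep the current behaviour
    elif not host:
        raise ValueError(f"missing host in ha_port spec: {spec!r}")
    port = int(port_text)  # raises ValueError for non-numeric input
    if not 0 < port < 65536:
        raise ValueError(f"port out of range: {port}")
    return host, port
```

With this, `parse_ha_port("8080", "10.0.0.5")` keeps the check on the backend itself, while `parse_ha_port("10.0.0.9:8080", "10.0.0.5")` moves it to the named host.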



Rune Saetre wrote:

> If that is indeed the case, where do I specify the destination IP 
> address to accompany the HAport?
>


-- 
Ted Dunning
Chief Scientist
Veoh Networks



Re: [Pound Mailing List] [patch] Fetch configurable URL in alive-checking (1.9.4)
"Eric dai" <daibaoming(at)gmail.com>
2005-11-08 03:35:48
Finally, where can I get this patch?

Re: [Pound Mailing List] [patch] Fetch configurable URL in alive-checking (1.9.4)
Rune Saetre <rune.saetre(at)netcom-gsm.no>
2005-11-08 04:18:26
Hi

> From: "Eric dai"
> Finally, where can I get this patch?

The latest version of this is to be found here:
  
http://folesvaert.dyndns.org/~rst/pound_stuff/Pound-1.9.4-AliveUri_no_conn_self_20051107_020408.tgz

If you plan to use the patch, you had better use this one, and not the one
on the mailing list, as that one eats up all the file descriptors if no
AliveUri is specified. (Forgot to close the socket after the simple
connect-test.)

There is an issue with not testing all instances when several webservers
are configured for the same IP address but with different ports. The
result is that servers aren't marked as failed until they are accessed,
but they are resurrected just fine. I will have a look at it as soon as
possible, and put a fixed version at the same place.
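
The thread does not say what causes this. One way such a symptom can arise, and a way to avoid checking the same backend "priority" times, is sketched below. It assumes, purely for illustration, that a backend appears in the list once per unit of priority and that entries are (address, port) pairs.

```python
def unique_backends(backends):
    """Collapse repeated entries so each (address, port) pair is checked once."""
    seen = set()
    result = []
    for address, port in backends:
        # Keying on the address alone would wrongly merge servers that share
        # an IP address but listen on different ports.
        key = (address, port)
        if key not in seen:
            seen.add(key)
            result.append(key)
    return result
```

For example, the list `[("10.0.0.1", 80), ("10.0.0.1", 80), ("10.0.0.1", 8080)]` is reduced to two entries, one per port.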

Rune

---
Rune Sætre <rune.saetre(at)netcom-gsm.no>
NetCom as, Infrastructure
..

