Bugreport for Pound-2.0b4
"Simon Matter" <simon.matter(at)ch.sauter-bc.com>
2005-12-12 10:43:52
I switched three production servers to Pound-2.0b4 last Friday, so I
can give some feedback. The test has been mostly successful so far, with two
problems showing up in my case:

1) Requests which take some time on the backend (generating images from
data in a SQL database) get a timeout looking like this:
Dec  9 20:25:33 sg-01 pound: response error read from 10.1.6.6:80: Connection timed out
This works fine with 1.9.5, and I didn't set any timeouts in pound.cfg.

2) In a configuration with only one backend, if the backend fails for a few
moments (for example while restarting Apache), I keep getting a '503 Service
Unavailable' even after the backend is available again. This is what shows
up in the logs:
Dec 12 09:32:55 mx-05-bsl pound: no back-end "GET /zabbix/charts.php?graphid=2&keep=1 HTTP/1.1" from 10.0.4.2

A sample of a used configuration is attached.
I didn't have any of the mentioned problems with 1.9.x.

BTW: 1.x versions of Pound leaked memory in my case, so I was forced to
restart it daily. The 2.0 code seems to fix it. I'll go back to 1.9.5 for
now and check whether it still leaks.

I hope my bugreport is useful.

Regards,
Simon
Attachments:  
pound.cfg.gz application/x-gzip 442 Bytes

Re: [Pound Mailing List] Bugreport for Pound-2.0b4
Robert Segall <roseg(at)apsis.ch>
2005-12-12 13:06:52
On Mon, 2005-12-12 at 10:43 +0100, Simon Matter wrote:[...]

Many thanks for the effort - this is much appreciated.
[...]

This is very puzzling, as the code that deals with it should be
identical with the 1.9 series. Could you try to add a timeout just to
see if it helps?
[...]

This could happen even on the 1.9 series. If a request happens to come
in while the back-end is down, the back-end is considered dead and
subsequent requests are rejected. However, the back-end should be
resurrected after a short while automatically. Please wait about a
minute (the default Alive value is 30 seconds, but you can change that)
and check if it comes back.
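
The behaviour described above (mark the back-end dead on a failed request, then probe it again once the Alive interval has passed) can be sketched in C. This is only an illustration under stated assumptions, not Pound's implementation: the struct, the function names, and the probe_ok flag (which stands in for a real connection attempt) are all hypothetical. Only the 30-second Alive default comes from the thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical back-end record; not Pound's actual struct. */
typedef struct {
    bool   alive;      /* false once a request has failed against it */
    time_t last_check; /* when it was marked dead or last probed */
} backend_sketch;

/* Called when a request fails: take the back-end out of rotation. */
static void mark_dead(backend_sketch *be, time_t now)
{
    assert(be != NULL);
    be->alive = false;
    be->last_check = now;
}

/*
 * Called periodically. If the back-end is dead and at least alive_secs
 * have passed since the last check, probe it again. probe_ok stands in
 * for the result of a real connection attempt.
 * Returns true if the back-end is usable after the call.
 */
static bool maybe_resurrect(backend_sketch *be, time_t now,
                            int alive_secs, bool probe_ok)
{
    assert(be != NULL);
    if (be->alive)
        return true;
    if (difftime(now, be->last_check) < alive_secs)
        return false;          /* too early to probe again */
    be->last_check = now;
    be->alive = probe_ok;      /* resurrect only if the probe succeeded */
    return be->alive;
}
```

With alive_secs set to 30, a back-end marked dead at time t is rejected until the first periodic check at or after t + 30, which is why waiting about a minute should be enough to see it return.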
[...]

Do let us know.
[...]

Very much so - keep up the good work.

Many thanks again.[...]

Re: [Pound Mailing List] Bugreport for Pound-2.0b4
"Simon Matter" <simon.matter(at)ch.sauter-bc.com>
2005-12-14 14:13:28
>>> 1) Requests which take some time on the backend (generating images from[...][...]

While looking at the code (warning: I have never written anything in C) it
seems that something has changed anyway.
I first tried to set TimeOut to 0, which according to the man page is (and
was) the default. But Pound doesn't accept 'TimeOut 0', and in config.c I
see that the timeout is now set to 15 by default. Is this intended? If so,
and 'to' can never be < 1, then the code in http.c doesn't make any sense:
...
BIO_set_close(be, BIO_CLOSE);
if(backend->to > 0) {
    BIO_set_callback_arg(be, (char *)&backend->to);
    BIO_set_callback(be, bio_callback);
}
...

And then, there is some logic in the 1.9.5 code which I couldn't find in
the new code:
#define SERVER_TO   (server_to > 0? server_to: 5)
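
The two observations above can be put side by side in a small C sketch. It rests on assumptions drawn only from this message: that 2.0b4 stores 15 when no TimeOut is given and rejects values below 1, and that 1.9.5 falls back to 5 through the SERVER_TO macro. The function names and the -1 convention for "directive absent" are hypothetical, not taken from the Pound source.

```c
#include <assert.h>

/* Assumed default, the value reported for config.c in 2.0b4. */
#define DEFAULT_TO 15

/* The 1.9.5 macro quoted above, written as a function: a non-positive
 * setting falls back to 5 seconds. */
static int server_to_195(int server_to)
{
    return server_to > 0 ? server_to : 5;
}

/*
 * Hypothetical 2.0b4-style handling. configured is the TimeOut value from
 * the config file, or -1 when the directive is absent. Returns the value
 * stored in the back-end, or -1 if the setting is rejected.
 */
static int backend_to_20b4(int configured)
{
    int to = (configured == -1) ? DEFAULT_TO : configured;

    if (to < 1)
        return -1;             /* 'TimeOut 0' is not accepted */
    assert(to > 0);            /* so the http.c guard can never be false */
    return to;
}

/* Mirrors the if(backend->to > 0) guard in the quoted http.c excerpt. */
static int callback_installed(int to)
{
    return to > 0;
}
```

Under these assumptions every accepted value passes the guard, so the timeout callback is always installed and a slow back-end is cut off after 15 seconds unless a larger TimeOut is configured.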

I really tried hard to understand all the code where 'to' is used, but
without success. I first came up with the attached patch
Pound-2.0b4-timeout.patch, but it seems not to fix it (of course, because I
don't even know whether to=0 should ever be possible).
[...][...][...]

The backend never comes back, at least not after hours.
With the timeout patch above and no TimeOut setting in the config, which
then means to=0, the timeout with slow pages doesn't happen again. But
when I stop and start the webserver, the backend still never comes back. I
still have the feeling that both problems are related somehow.
[...][...][...]

At least in the short time I'm running the 1.9 code again, it seems not to
grow anymore.

One more thing: the man page is not consistent with the code regarding the
HA options. The attached patch Pound-2.0b4-haportaddr.patch may fix the
docs; otherwise the code needs to be fixed.

I'm sorry that I can't offer a fix to the problems reported. I have spent
hours but I just don't really understand what I'm doing.

Regards,
Simon
Attachments:  
Pound-2.0b4-timeout.patch text/x-patch 1297 Bytes
Pound-2.0b4-haportaddr.patch application/octet-stream 0 Bytes

Re: [Pound Mailing List] Bugreport for Pound-2.0b4
Robert Segall <roseg(at)apsis.ch>
2005-12-15 13:25:05
On Wed, 2005-12-14 at 14:13 +0100, Simon Matter wrote:[...]

See? I told you it is the same code!

Seriously - it looks like someone changed the code and forgot to update
the man page. We'll check that.
[...]

This is for connection checking (is_readable), where you don't want the
thread to hang forever if the back-end is down. Don't worry about it.
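
A bounded readability check of this kind can be sketched with POSIX poll(). This is not Pound's is_readable(); the function names and signatures are hypothetical and only illustrate the idea of waiting for data with an upper limit, so that a thread does not block indefinitely on a dead back-end.

```c
#include <assert.h>
#include <errno.h>
#include <poll.h>
#include <unistd.h>

/*
 * Wait up to to_secs seconds for fd to become readable.
 * Returns 1 if data (or end-of-file) can be read, 0 on timeout or error.
 * If a signal interrupts the wait, the full timeout starts again.
 */
static int is_readable_sketch(int fd, int to_secs)
{
    struct pollfd p;
    int rc;

    assert(to_secs >= 0);
    p.fd = fd;
    p.events = POLLIN;
    p.revents = 0;

    do {
        rc = poll(&p, 1, to_secs * 1000);   /* poll() takes milliseconds */
    } while (rc < 0 && errno == EINTR);     /* retry if interrupted */

    return rc > 0 && (p.revents & (POLLIN | POLLHUP)) != 0;
}

/*
 * Read up to len bytes, giving up after to_secs seconds.
 * Returns the number of bytes read, 0 at end-of-file, or -1 on
 * timeout or error.
 */
static ssize_t read_with_timeout(int fd, void *buf, size_t len, int to_secs)
{
    if (!is_readable_sketch(fd, to_secs))
        return -1;
    return read(fd, buf, len);
}
```

The fallback of 5 seconds in the 1.9.5 macro plays the role of to_secs here when no explicit value is configured.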
[...]

Thanks for letting us know - we'll look into it.
[...]

The patch is empty. Could you please try again?
[...][...]

Re: [Pound Mailing List] Bugreport for Pound-2.0b4
Robert Segall <roseg(at)apsis.ch>
2005-12-15 14:48:44
On Wed, 2005-12-14 at 14:13 +0100, Simon Matter wrote:[...]

Please try compiling with the attached svc.c - I believe it should help.

You should not use a timeout of 0, as that is a deprecated option, and
will vanish from the man page as well. Just set something which makes
sense for your setup.[...]
Attachments:  
svc.c text/x-csrc 28753 Bytes

Re: [Pound Mailing List] Bugreport for Pound-2.0b4
"Simon Matter" <simon.matter(at)ch.sauter-bc.com>
2005-12-16 10:52:55
>> One more thing: the man page is not consistent with the code with the HA[...][...]

Okay, next try.

Simon
Attachments:  
Pound-2.0b4-haportaddr.patch.gz application/x-gzip 715 Bytes

Re: [Pound Mailing List] Bugreport for Pound-2.0b4
"Simon Matter" <simon.matter(at)ch.sauter-bc.com>
2005-12-16 11:20:33
> On Wed, 2005-12-14 at 14:13 +0100, Simon Matter wrote:[...][...]

Okay, it seems to work fine now. I have set TimeOut to a large number and
it works perfectly, and a restarted backend is also resurrected without
problems.

Now my only question is: wouldn't it make sense to have a global default
TimeOut parameter so we don't have to set it for all backends? If TimeOut
could be set in the Global and/or the Backend section, that would be fine.
What do you think?

Thanks,
Simon

Re: [Pound Mailing List] Bugreport for Pound-2.0b4
"Simon Matter" <simon.matter(at)ch.sauter-bc.com>
2005-12-16 11:34:24
>> On Wed, 2005-12-14 at 14:13 +0100, Simon Matter wrote:
>>> The backend never comes back, at least not after hours.
>>> With the timeout patch above, and no TimeOut setting in the config which
>>> then means to=0, the timeout with slow pages doesn't happen again. But
>>> when I stop and start the webserver, the backend still never comes back. I
>>> still have the feeling that both problems are related somehow.[...][...]

Forgot the patch for the man page.
Attachments:  
Pound-2.0b4-timeout.patch text/x-patch 508 Bytes

Re: [Pound Mailing List] Bugreport for Pound-2.0b4
Robert Segall <roseg(at)apsis.ch>
2005-12-16 13:01:03
On Fri, 2005-12-16 at 11:20 +0100, Simon Matter wrote:[...]

Great. Thanks for letting us know.
[...]

There is a default value, which is hopefully a good compromise. I would
be happy to change the default if somebody comes up with a good
suggestion. 15 seconds seems OK for the vast majority of servers, which
are quite lightly loaded - we could probably use something much lower,
like 5, without any adverse effects.

Beyond that I can't really see how a global value would make sense -
after all each back-end is different, and probably needs a different
time-out.[...]

RE: [Pound Mailing List] Bugreport for Pound-2.0b4
"Silvio Bierman" <sbierman(at)jambo-software.com>
2005-12-16 13:23:59
> -----Original Message-----
> From: Robert Segall [mailto:roseg(at)apsis.ch]
> Sent: 16 December, 2005 13:01
> To: pound(at)apsis.ch
> Subject: Re: [Pound Mailing List] Bugreport for Pound-2.0b4
>
> On Fri, 2005-12-16 at 11:20 +0100, Simon Matter wrote:
> > Okay, it seems to work fine now. I have set TimeOut to a large number and
> > it works perfect, and also a restarted backend resurrects without
> > problems.
>
> Great. Thanks for letting us know.
>
> > Now my only question is wouldn't it make sense to have a global default
> > TimeOut parameter so we don't have to set it for all backends. If time
> > could be set in the Global and/or the Backend Session, this would be fine.
> > What do you mean?
>
> There is a default value, which is hopefully a good compromise. I would
> be happy to change the default if somebody comes up with a good
> suggestion. 15 seconds seems OK for the vast majority of servers, which
> are quite lightly loaded - we could probably use something much lower,
> like 5, without any adverse effects.
>
> Beyond that I can't really see how a global value would make sense -
> after all each back-end is different, and probably needs a different
> time-out.
> --
> Robert Segall
> Apsis GmbH
> Postfach, Uetikon am See, CH-8707
> Tel: +41-44-920 4904

Hello Robert,

I can live with a per-backend setting, but your statement is not universally
true: we have about 20 UrlGroups (for different HeadRequire Host values).
They all contain three backends, which are basically the same three server
instances, only with varying port numbers.

For me not only are the backends logically equal within a UrlGroup, they are
equal globally.
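
The global default being discussed amounts to a fallback chain, sketched below in C. This illustrates the proposal only and does not describe how Pound resolves TimeOut: the structs, the TO_UNSET marker, and resolve_timeout() are hypothetical. The built-in value of 15 seconds is the default mentioned earlier in the thread.

```c
#include <assert.h>
#include <stddef.h>

#define TO_UNSET 0   /* hypothetical marker: no TimeOut given at this level */

/* Built-in default of 15 seconds, the value mentioned in the thread. */
enum { BUILTIN_TO = 15 };

typedef struct {
    int to;          /* per-back-end TimeOut in seconds, or TO_UNSET */
} backend_cfg;

typedef struct {
    int global_to;   /* hypothetical global TimeOut, or TO_UNSET */
} global_cfg;

/* Most specific setting wins: back-end, then global, then built-in. */
static int resolve_timeout(const global_cfg *g, const backend_cfg *be)
{
    assert(g != NULL && be != NULL);
    if (be->to != TO_UNSET)
        return be->to;
    if (g->global_to != TO_UNSET)
        return g->global_to;
    return BUILTIN_TO;
}
```

With such a chain, a setup of 20 groups with three identical back-ends each would need one global setting instead of 60 per-back-end ones, while a back-end that really is different could still override it.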

Just my two cents,

Regards,

Silvio Bierman
