dying/resurrecting backends
"Timur Evdokimov" <timur(at)jacum.com>
2006-06-06 21:15:46
Dear all,

I'm experiencing strange problems with pound.

Namely, messages like these appear in the log quite often, roughly once every
5 minutes:

Jun  6 20:54:00 lb01 pound: backend 10.0.0.21:8080 connect: Connection timed out
Jun  6 20:54:04 lb01 pound: BackEnd 10.0.0.21 resurrect
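
To put numbers on how often and for how long a backend is marked dead, the log could be scanned for failure/resurrect pairs. The following is only a minimal sketch: the two regular expressions are modelled on the two log lines quoted above and nothing else, the function names `parse_ts` and `outages` are made up for illustration, and the year has to be supplied because syslog timestamps do not carry one.

```python
import re
from datetime import datetime

# Patterns modelled on the two sample lines above (an assumption: other
# Pound versions or syslog setups may word these messages differently).
DEAD = re.compile(
    r"^(\w{3}\s+\d+ \d\d:\d\d:\d\d) \S+ pound: backend (\S+?):\d+ connect: .*$"
)
ALIVE = re.compile(
    r"^(\w{3}\s+\d+ \d\d:\d\d:\d\d) \S+ pound: BackEnd (\S+) resurrect$"
)


def parse_ts(text, year=2006):
    """Parse a syslog timestamp such as 'Jun  6 20:54:00' (no year in the log)."""
    normalized = " ".join(text.split())  # collapse the padding before the day
    return datetime.strptime(f"{year} {normalized}", "%Y %b %d %H:%M:%S")


def outages(lines, year=2006):
    """Return (backend, first_failure_time, seconds_until_resurrect) tuples."""
    down = {}    # backend address -> time of the first connect failure
    result = []
    for line in lines:
        line = line.rstrip("\n")
        m = DEAD.match(line)
        if m:
            # Keep the earliest failure if several are logged in a row.
            down.setdefault(m.group(2), parse_ts(m.group(1), year))
            continue
        m = ALIVE.match(line)
        if m and m.group(2) in down:
            start = down.pop(m.group(2))
            end = parse_ts(m.group(1), year)
            result.append((m.group(2), start, (end - start).total_seconds()))
    return result
```

Fed the two lines above, `outages` reports one outage of 4.0 seconds for 10.0.0.21. Run over a full day of log it would show whether the failures cluster on one backend or at particular times.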

I use cookie-based sticky sessions on this server group.
As an immediate consequence, when a backend dies, all sessions to this
backend are reset, and the users have to log on again. This is very, very
annoying.

What is very surprising is that a pound process running on one of the
backends, reverse proxying to that same backend as well as to the others,
doesn't experience this problem.
So I assume there are no problems with the backends themselves.

All the servers and the reverse proxy machine are connected to the same
switch, so it just can't be the hardware causing this.

I've tried applying these settings on the reverse proxy machine:
ulimit -n 16000
echo "1" > /proc/sys/net/ipv4/tcp_tw_recycle
...but this didn't help.

All of this is with pound 1.10.
I've also tried 2.0.5, alas with the same outcome.
RedHat FC4 is running on all machines.
Backend Apache is 2.0.55.
Backends are running at 3-5% of their CPU capacity, so there's no reason for
a timeout.

This is my pound.cfg:

User            nobody
Group           nobody
ExtendedHTTP    0
WebDAV          0
LogLevel        0
Alive           5
Server          120
Client          120
RewriteRedirect 0

ListenHTTP 123.123.123.123,80

UrlGroup ".*"
        BackEnd 10.0.0.21,8080,1
        BackEnd 10.0.0.22,8080,1
        BackEnd 10.0.0.23,8080,1
        Session COOKIE PHPSESSID 7200
EndGroup

At this moment I'm out of ideas.
I might try yet another server to rule out a faulty network card or network
driver...

Could you please give me any clue as to what could be wrong here, or what
else I could test?

Any help will be **definitely** appreciated.

Kind regards,
Timur 

Re: [Pound Mailing List] dying/resurrecting backends
Robert Segall <roseg(at)apsis.ch>
2006-06-08 19:04:16
On Tue, 2006-06-06 at 21:15 +0200, Timur Evdokimov wrote:[...]

It sounds very much like a firewall/networking problem. Somehow Pound
can't open a connection to the back-end within the defined timeout.

The fact that it works correctly on the back-end machine tends to
support this suspicion.

I suggest you check carefully on your networking and/or firewall
definitions to see what may be causing it.[...]
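
One way to check this independently of Pound is to time plain TCP connects from the proxy machine to each back-end. The sketch below is illustrative only: the function name `probe` and the 5-second default timeout are assumptions, not values taken from Pound, and it tests nothing beyond the TCP handshake.

```python
import socket
import time


def probe(host, port, timeout=5.0):
    """Try one TCP connect; return (ok, elapsed_seconds, error_text_or_None)."""
    start = time.monotonic()
    try:
        # create_connection performs the TCP handshake and raises OSError
        # (including timeouts) if it cannot complete within `timeout`.
        with socket.create_connection((host, port), timeout=timeout):
            return True, time.monotonic() - start, None
    except OSError as exc:
        return False, time.monotonic() - start, str(exc)
```

Calling `probe("10.0.0.21", 8080)` in a loop, once from the proxy machine and once on the back-end itself, would show whether connects fail or slow down only when they cross the network and the firewall rules.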

RE: [Pound Mailing List] dying/resurrecting backends
"Timur Evdokimov" <timur(at)jacum.com>
2006-06-08 21:03:52
Robert,

Thank you for the very useful hint.

At least I know what it is now. It is *very* strange, but when iptables on
the backends is completely shut down (ACCEPT for everything) there are no
timeouts and everything looks much better. It must be the IP connection
tracking table becoming full, or something like that.
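
The full-table guess is unconfirmed, but it could be checked on a backend by comparing the current number of tracked connections with the configured maximum. A minimal sketch follows, with the caveat that the /proc paths are assumptions that depend on the kernel: newer kernels expose nf_conntrack_count and nf_conntrack_max under /proc/sys/net/netfilter, while older 2.6 kernels with ip_conntrack are assumed here to use /proc/sys/net/ipv4/netfilter. The function name `conntrack_usage` is made up for illustration.

```python
from pathlib import Path

# (subdirectory, file prefix) pairs to try, newest layout first.
# These locations are assumptions; verify them on the kernel in question.
LAYOUTS = (
    ("net/netfilter", "nf_conntrack"),
    ("net/ipv4/netfilter", "ip_conntrack"),
)


def conntrack_usage(sys_root="/proc/sys"):
    """Return (count, maximum, fraction_used), or None if no counters exist."""
    for subdir, prefix in LAYOUTS:
        base = Path(sys_root) / subdir
        count_file = base / f"{prefix}_count"
        max_file = base / f"{prefix}_max"
        if count_file.is_file() and max_file.is_file():
            count = int(count_file.read_text().strip())
            limit = int(max_file.read_text().strip())
            return count, limit, count / limit
    return None  # connection tracking not loaded, or an unknown layout
```

If the fraction stays close to 1.0 around the times the timeouts are logged, that would support the full-table guess; if it stays low, the cause is more likely in the rules themselves.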

I wonder if there is any advice on an optimal TCP/IP and iptables
configuration for pound.
Or maybe someone can share their netfilter script?

The funny thing is that I tested the backend/pound combination very
thoroughly before putting it into production, for hours and hours and at up
to 500 requests/second, but I didn't turn the backends' netfilters on.

Regards,
Timur