Dear list,
I'm experiencing the following problem with Pound 2.4.1 under FreeBSD
6.3. Unfortunately, I have not yet been able to reproduce the problem
in a test environment, so I can't tell exactly what the cause is. On
the same production hardware, with the same config file, Pound 2.3
works fine.
To simplify, I have one balancer server and two worker servers.
Here is the configuration file for pound:
-------------------------------------------------------
DynScale 0
Alive 10
ListenHTTPS
    Address balancer.some.domain
    Port 443
    Client 15
    Cert "/etc/balancer.some.domain.pem"
    Ciphers "ALL:!ADH:!EXPORT56:RC4+RSA:+HIGH:+MEDIUM:+LOW:+SSLv2:+EXP:+eNULL"
End
Service
    DynScale 0
    URL "/FooService"
    BackEnd
        Address hawaii.vlan
        Port 8001
        TimeOut 45
    End
    BackEnd
        Address tahiti.vlan
        Port 8001
        TimeOut 45
    End
    Session
        Type URL
        ID "session"
        TTL 3600
    End
End
-------------------------------------------------------
Now, FooService is running on only one server, hawaii. My practice is
to leave references to not-yet-existing servers in Pound's config file
as room for future expansion.
As Pound starts and clients start sending requests for FooService,
there is an increasing flow of
-------------------------------------
(12EF) connect_nb: error after getsockopt: connection refused
(12EF) backend 1.2.3.4 connect: invalid socket
(12EF) connect_nb: error after getsockopt: connection refused
(12EF) backend 1.2.3.4 connect: invalid socket
(12EF) connect_nb: error after getsockopt: connection refused
(12EF) backend 1.2.3.4 connect: invalid socket
-------------------------------------
where 1.2.3.4 is the address of tahiti. These messages keep
repeating not once every ten seconds, but much more frequently,
a few times per second. Perhaps not at full machine speed, but I
have the impression that the loop at http.c:744 keeps continuing
at http.c:784 forever, as though the backend is never removed.
As more clients send requests, more threads (1234) start looping
and within a minute or two the link between balancer and tahiti
becomes saturated with SYN packets.
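To make clear what behaviour I expected instead, here is a minimal
sketch. It is NOT Pound's code: backend_t, try_connect and pick_backend
are hypothetical names I made up for illustration, and the "accepts"
flag merely stands in for a real connect(). The point is only that a
backend which refuses a connection gets marked dead, so that later
requests skip it instead of retrying it.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical backend record; not Pound's actual structure. */
typedef struct {
    const char *name;
    int alive;    /* 1 = eligible for selection, 0 = marked dead */
    int accepts;  /* stand-in for "a real connect() would succeed" */
} backend_t;

/* Stand-in for a connection attempt: 0 on success, -1 on failure. */
static int try_connect(const backend_t *be)
{
    return be->accepts ? 0 : -1;
}

/*
 * Try each live backend at most once for this request.
 * A backend that refuses is marked dead, so subsequent requests skip it
 * until some health check (not shown here) brings it back.
 * Returns the index of the backend used, or -1 if none is available.
 * *attempts receives the number of connection attempts made.
 */
static int pick_backend(backend_t *bes, size_t n, int *attempts)
{
    assert(bes != NULL && attempts != NULL);
    *attempts = 0;
    for (size_t i = 0; i < n; i++) {
        if (!bes[i].alive)
            continue;              /* already known to be dead */
        (*attempts)++;
        if (try_connect(&bes[i]) == 0)
            return (int)i;         /* connected */
        bes[i].alive = 0;          /* refused: do not retry it */
    }
    return -1;
}
```

With tahiti refusing and hawaii accepting, the first request would cost
one failed attempt against tahiti, and every later request would go
straight to hawaii. What I observe looks like the opposite: the dead
backend keeps being retried.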
I'm not sure about the exact wording of the "invalid socket" line. It
says something like that, but reproducing it is rather expensive now
that I have forgotten it (it requires interrupting a production
server), sorry.
As a wild guess, I thought the problem could be in svc.c:828:

    if(error) {
        /* getsockopt() shows an error */
        errno = error;
        logmsg(LOG_WARNING, "(%lx) connect_nb: error after getsockopt: %s",
            pthread_self(), strerror(errno));
        return -1;
    }

since logmsg may clobber the errno value, but swapping the logmsg and
errno lines didn't help at all. Examining connect_nb and kill_be,
I couldn't find anything relevant either.
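For reference, the reordering I tried looks roughly like the sketch
below. It is a simplified stand-in, not the actual svc.c code:
noisy_log is a hypothetical logger that deliberately overwrites errno
(as a real logging call might), and report_error is a made-up wrapper
name. The only idea is to assign errno last, after logging.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical logger standing in for logmsg(); it deliberately
   clobbers errno to imitate a logging call that changes it. */
static void noisy_log(const char *msg)
{
    fprintf(stderr, "%s\n", msg);
    errno = EINVAL;
}

/* Log a socket error (as obtained from getsockopt(SO_ERROR)) and
   return -1 with errno set to that error for the caller. */
static int report_error(int so_error)
{
    char buf[128];

    assert(so_error != 0);
    /* Format from the saved value, not from errno. */
    snprintf(buf, sizeof buf,
             "connect_nb: error after getsockopt: %s", strerror(so_error));
    noisy_log(buf);
    errno = so_error;  /* assigned last, so logging cannot overwrite it */
    return -1;
}
```

With this ordering the caller sees the socket error in errno regardless
of what the logger does, yet the behaviour described above stayed the
same, so the cause is probably elsewhere.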
Any suggestions, please?
Thank you.
Dmitry Dvoinikov
http://www.targeted.org/