host dead/resurrect on static content.
host dead/resurrect on static content.
Albert <pound(at)alacra.com>
2007-05-05 21:48:28
I noticed a strange problem with one of my services (specifically
created for static content), and I was wondering if anybody else has
seen the issue. Here's the pound config for the service:
Service
    HeadRequire "Host: .*mydomain.com.*"
    URL ".*\.(gif|jpg|js|css|ico)$"
    BackEnd
        Address 192.168.65.120
        Port 8080
        HAPort 5555
    End
    BackEnd
        Address 192.168.1.120
        Port 8080
        HAPort 5555
    End
    BackEnd
        Address 192.168.65.177
        Port 8080
        HAPort 5555
    End
    BackEnd
        Address 192.168.1.34
        Port 8080
        HAPort 5555
    End
End
I have an almost identical Service for non-static content (without the
URL directive) that goes to the same 4 servers, but on port 80 for the
backends. Here's what I see in my pound error log (I made a minor
change in http.c to display the request that causes the timeouts):
May 5 12:08:18 rp1 pound: backend 192.168.65.120:8080 connect: Connection timed out, GET /pubsite/images/alacra_store_logo_left.gif HTTP/1.1
May 5 12:08:18 rp1 pound: BackEnd 192.168.65.120:8080 resurrect
May 5 12:11:42 rp1 pound: backend 192.168.1.120:8080 connect: Connection timed out, GET /pubsite/script/preferences.js HTTP/1.1
May 5 12:11:48 rp1 pound: BackEnd 192.168.1.120:8080 resurrect
May 5 13:04:22 rp1 pound: backend 192.168.1.34:8080 connect: Connection timed out, GET /pubsite/images/flags/gif/VC.gif HTTP/1.1
May 5 13:04:48 rp1 pound: BackEnd 192.168.1.34:8080 resurrect
May 5 13:32:26 rp1 pound: backend 192.168.65.177:8080 connect: Connection timed out, GET /pubsite/images/flags/gif/CL.gif HTTP/1.1
May 5 13:32:35 rp1 pound: BackEnd 192.168.65.177:8080 resurrect
May 5 14:29:07 rp1 pound: backend 192.168.1.34:8080 connect: Connection timed out, GET /pubsite/images/homesearch.gif HTTP/1.1
May 5 14:29:14 rp1 pound: BackEnd 192.168.1.34:8080 resurrect
May 5 14:40:13 rp1 pound: backend 192.168.1.34:8080 connect: Connection timed out, GET /pubsite/script/pubsite.js HTTP/1.1
May 5 14:40:30 rp1 pound: BackEnd 192.168.1.34:8080 resurrect
May 5 14:47:05 rp1 pound: backend 192.168.1.120:8080 connect: Connection timed out, GET /pubsite/images/magnifying.gif HTTP/1.1
May 5 14:47:30 rp1 pound: BackEnd 192.168.1.120:8080 resurrect
May 5 14:48:52 rp1 pound: backend 192.168.1.120:8080 connect: Connection timed out, GET /pubsite/images/searching.gif HTTP/1.1
May 5 14:49:00 rp1 pound: BackEnd 192.168.1.120:8080 resurrect
All of these images are tiny (no more than 10 KB, some as small as
600 bytes). I'm guessing the problem has to do with the traffic on
our site, but that doesn't explain why I don't see "Connection timed out"
messages for non-static content on port 80. It's almost as if we have the
problem with static content only (packet size related?). I have a
global "TimeOut" set to 300 and "Alive" set to 15. We're running IIS 6.0
on the backends. In IIS, I have a Connection Timeout set to 10 seconds
for port 8080 requests.
Any ideas would be appreciated.
Albert
Re: [Pound Mailing List] host dead/resurrect on static content.
Robert Segall <roseg(at)apsis.ch>
2007-05-08 17:32:38
On Sat, 2007-05-05 at 15:48 -0400, Albert wrote:
> I noticed a strange problem with one of my services (specifically
> created for static content), and I was wondering if anybody else has
> seen the issue. Here's the pound config for the service:
>
> Service
>     HeadRequire "Host: .*mydomain.com.*"
>     URL ".*\.(gif|jpg|js|css|ico)$"
>     BackEnd
>         Address 192.168.65.120
>         Port 8080
>         HAPort 5555
>     End
>     BackEnd
>         Address 192.168.1.120
>         Port 8080
>         HAPort 5555
>     End
>     BackEnd
>         Address 192.168.65.177
>         Port 8080
>         HAPort 5555
>     End
>     BackEnd
>         Address 192.168.1.34
>         Port 8080
>         HAPort 5555
>     End
> End
>
>
> I have an almost identical Service for non-static content (without URL
> directive) that goes to the same 4 servers, but on port 80 for
> backends. Here's what I see in my pound error log (I made a minor
> change in http.c to display the request that causes the time outs) :
>
> May 5 12:08:18 rp1 pound: backend 192.168.65.120:8080 connect: Connection timed out, GET /pubsite/images/alacra_store_logo_left.gif HTTP/1.1
> May 5 12:08:18 rp1 pound: BackEnd 192.168.65.120:8080 resurrect
> May 5 12:11:42 rp1 pound: backend 192.168.1.120:8080 connect: Connection timed out, GET /pubsite/script/preferences.js HTTP/1.1
> May 5 12:11:48 rp1 pound: BackEnd 192.168.1.120:8080 resurrect
> May 5 13:04:22 rp1 pound: backend 192.168.1.34:8080 connect: Connection timed out, GET /pubsite/images/flags/gif/VC.gif HTTP/1.1
> May 5 13:04:48 rp1 pound: BackEnd 192.168.1.34:8080 resurrect
> May 5 13:32:26 rp1 pound: backend 192.168.65.177:8080 connect: Connection timed out, GET /pubsite/images/flags/gif/CL.gif HTTP/1.1
> May 5 13:32:35 rp1 pound: BackEnd 192.168.65.177:8080 resurrect
> May 5 14:29:07 rp1 pound: backend 192.168.1.34:8080 connect: Connection timed out, GET /pubsite/images/homesearch.gif HTTP/1.1
> May 5 14:29:14 rp1 pound: BackEnd 192.168.1.34:8080 resurrect
> May 5 14:40:13 rp1 pound: backend 192.168.1.34:8080 connect: Connection timed out, GET /pubsite/script/pubsite.js HTTP/1.1
> May 5 14:40:30 rp1 pound: BackEnd 192.168.1.34:8080 resurrect
> May 5 14:47:05 rp1 pound: backend 192.168.1.120:8080 connect: Connection timed out, GET /pubsite/images/magnifying.gif HTTP/1.1
> May 5 14:47:30 rp1 pound: BackEnd 192.168.1.120:8080 resurrect
> May 5 14:48:52 rp1 pound: backend 192.168.1.120:8080 connect: Connection timed out, GET /pubsite/images/searching.gif HTTP/1.1
> May 5 14:49:00 rp1 pound: BackEnd 192.168.1.120:8080 resurrect
Just a wild guess: are your Alive and TimeOut definitions AFTER the
Service? In that case they do not apply to previously defined services,
which would be consistent with your log.
> All of these images are tiny (no more than 10 KB, some as small as
> 600 bytes). I'm guessing the problem has to do with the traffic on
> our site, but that doesn't explain why I don't see "Connection timed out"
> messages for non-static content on port 80. It's almost as if we have the
> problem with static content only (packet size related?). I have a
> global "TimeOut" set to 300 and "Alive" set to 15. We're running IIS 6.0
> on the backends. In IIS, I have a Connection Timeout set to 10 seconds
> for port 8080 requests.
>
>
> Any ideas would be appreciated.
Have a look at your config file - I'm quite sure that the content type
has nothing to do with the connection time-out.
--
Robert Segall
Apsis GmbH
Postfach, Uetikon am See, CH-8707
Tel: +41-44-920 4904
Re: [Pound Mailing List] host dead/resurrect on static content.
Albert <pound(at)alacra.com>
2007-05-09 19:45:27
> Just a wild guess: are your Alive and TimeOut definitions AFTER the
> Service? In that case they do not apply to previously defined services,
> which would be consistent with your log.
>
No, the global options are defined at the top of my config file.
> Have a look at your config file - I'm quite sure that the content type
> has nothing to do with the connection time-out.
>
Yes, I'm pretty sure it's not content related. I will run a few more tests
(including putting static content together with dynamic content on port
80). I believe at this point it's not pound but something on our network
(between the firewall & switches). Our pound server sits in the DMZ and
the backends are on the internal network. I'm running some network tests
now, and trying to figure out why a small number of (static) requests are
not reaching our backends at all.
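
One way to run the kind of network test described above is to time plain TCP connects from the pound host to a backend, independently of pound. The following is only a sketch in Python, not part of pound; the names `probe_connect` and `probe_many` and the 2-second default are assumptions made for illustration.

```python
import socket
import time


def probe_connect(host, port, timeout=2.0):
    """Try one TCP connect; return (ok, seconds_elapsed, error_text)."""
    start = time.monotonic()
    try:
        # create_connection raises OSError (including timeouts) on failure.
        with socket.create_connection((host, port), timeout=timeout):
            return True, time.monotonic() - start, ""
    except OSError as exc:
        return False, time.monotonic() - start, str(exc)


def probe_many(host, port, attempts=5, timeout=2.0, pause=0.0):
    """Run several probes in a row and collect the results for comparison."""
    results = []
    for _ in range(attempts):
        results.append(probe_connect(host, port, timeout))
        time.sleep(pause)
    return results
```

Running `probe_many("192.168.65.120", 8080, attempts=100, pause=1.0)` from the pound host, and the same against port 80, would show whether connects to port 8080 fail or stall on their own, and the failure times could then be compared with the timestamps in the pound log.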
Re: [Pound Mailing List] host dead/resurrect on static content.
Albert <pound(at)alacra.com>
2007-05-11 00:35:29
> Yes, I'm pretty sure it's not content related. I will run a few more
> tests (including putting static content together with dynamic content
> on port 80). I believe at this point it's not pound but something on
> our network (between the firewall & switches). Our pound server sits
> in the DMZ and the backends are on the internal network. I'm running
> some network tests now, and trying to figure out why a small number of
> (static) requests are not reaching our backends at all.
>
First off, I want to apologize for using this forum to figure out this
problem, but it might be related to pound (though at this point I'm
thinking it's something else).
Two items I wanted to mention:
1. The problem we're experiencing is with static content that is being
retrieved over an already open connection (so the client might have
received a bunch of images and then gets stuck in the middle). However,
it looks like pound is calling connect_nb() on line 1241 of v2.3.1,
meaning that the backend connection was closed a few lines before.
connect_nb() then waits up to the TimeOut value to connect, but fails,
and that's when pound kills the backend, thinking it's dead. Are there
any tools I can use to figure out why this is happening? What's
interesting is that I see other clients making requests and going to the
backend during the TimeOut interval (when I had it set to 5 minutes, I
would see a lot of content requested and served).
2. I believe the timeout used for connect_nb() to backends should be
less than the general TimeOut (or at least I should be able to define
a lower timeout). Here's my thinking: the connect_nb() call should be
really quick (if I don't connect within, say, 2 seconds, then maybe I
should not use that backend at all?). However, I don't want to set my
TimeOut too low, as the content I'm delivering might take some time to
generate (after establishing the connection). I think it makes sense to
have a "ConnectTimeOut", which can optionally be set, to use for
connecting to backends. We do a lot of database coding here, and we have
2 separate timeouts: one to connect to a database, and another to run
the query. I think the same approach should be used here.
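
The two-timeout idea in item 2 can be sketched as follows. This is Python rather than pound's C, and it is not how pound implements anything; `fetch`, `connect_timeout` and `read_timeout` are hypothetical names, with `connect_timeout` standing in for the proposed "ConnectTimeOut" and `read_timeout` for the existing TimeOut.

```python
import socket


def fetch(host, port, request, connect_timeout=2.0, read_timeout=300.0):
    """Send request bytes and return the full reply, using two separate timeouts."""
    # Phase 1: connection setup gets the short limit.
    sock = socket.create_connection((host, port), timeout=connect_timeout)
    try:
        # Phase 2: once connected, allow the longer limit for the reply.
        sock.settimeout(read_timeout)
        sock.sendall(request)
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:
                # The peer closed the connection: the reply is complete.
                break
            chunks.append(data)
        return b"".join(chunks)
    finally:
        sock.close()
```

With this split, a backend that cannot be reached fails after about 2 seconds, while a backend that accepts the connection still gets the full 300 seconds to generate its content.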
Albert