host dead/resurrect on static content.
Albert <pound(at)alacra.com>
2007-05-05 21:48:28
I noticed a strange problem with one of my services (specifically 
created for static content), and I was wondering if anybody else has 
seen the issue.  Here's the pound config for the service:

Service
        HeadRequire     "Host: .*mydomain.com.*"
        URL             ".*\.(gif|jpg|js|css|ico)$"
        BackEnd
                Address 192.168.65.120
                Port    8080
                HAPort  5555
        End
        BackEnd
                Address 192.168.1.120
                Port    8080
                HAPort  5555
        End
        BackEnd
                Address 192.168.65.177
                Port    8080
                HAPort  5555
        End
        BackEnd
                Address 192.168.1.34
                Port    8080
                HAPort  5555
        End
End


I have an almost identical Service for non-static content (without the
URL directive) that goes to the same 4 servers, but on port 80 for the
backends.  Here's what I see in my pound error log (I made a minor
change in http.c to display the request that causes the timeouts):

May  5 12:08:18 rp1 pound: backend 192.168.65.120:8080 connect: Connection timed out, GET /pubsite/images/alacra_store_logo_left.gif HTTP/1.1
May  5 12:08:18 rp1 pound: BackEnd 192.168.65.120:8080 resurrect
May  5 12:11:42 rp1 pound: backend 192.168.1.120:8080 connect: Connection timed out, GET /pubsite/script/preferences.js HTTP/1.1
May  5 12:11:48 rp1 pound: BackEnd 192.168.1.120:8080 resurrect
May  5 13:04:22 rp1 pound: backend 192.168.1.34:8080 connect: Connection timed out, GET /pubsite/images/flags/gif/VC.gif HTTP/1.1
May  5 13:04:48 rp1 pound: BackEnd 192.168.1.34:8080 resurrect
May  5 13:32:26 rp1 pound: backend 192.168.65.177:8080 connect: Connection timed out, GET /pubsite/images/flags/gif/CL.gif HTTP/1.1
May  5 13:32:35 rp1 pound: BackEnd 192.168.65.177:8080 resurrect
May  5 14:29:07 rp1 pound: backend 192.168.1.34:8080 connect: Connection timed out, GET /pubsite/images/homesearch.gif HTTP/1.1
May  5 14:29:14 rp1 pound: BackEnd 192.168.1.34:8080 resurrect
May  5 14:40:13 rp1 pound: backend 192.168.1.34:8080 connect: Connection timed out, GET /pubsite/script/pubsite.js HTTP/1.1
May  5 14:40:30 rp1 pound: BackEnd 192.168.1.34:8080 resurrect
May  5 14:47:05 rp1 pound: backend 192.168.1.120:8080 connect: Connection timed out, GET /pubsite/images/magnifying.gif HTTP/1.1
May  5 14:47:30 rp1 pound: BackEnd 192.168.1.120:8080 resurrect
May  5 14:48:52 rp1 pound: backend 192.168.1.120:8080 connect: Connection timed out, GET /pubsite/images/searching.gif HTTP/1.1
May  5 14:49:00 rp1 pound: BackEnd 192.168.1.120:8080 resurrect
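
A log like the one above can be summarized per backend with a short script.
The following is only a sketch, not part of pound: the function name
downtime_per_event is made up for illustration, the sample holds just the
first three timeout/resurrect pairs from the log, and the year is assumed to
be 2007 because syslog lines do not carry one. It measures the gap between
each logged connect timeout and the next resurrect for the same backend.

```python
import re
from datetime import datetime

# First three timeout/resurrect pairs from the log, one entry per line.
LOG = """\
May  5 12:08:18 rp1 pound: backend 192.168.65.120:8080 connect: Connection timed out, GET /pubsite/images/alacra_store_logo_left.gif HTTP/1.1
May  5 12:08:18 rp1 pound: BackEnd 192.168.65.120:8080 resurrect
May  5 12:11:42 rp1 pound: backend 192.168.1.120:8080 connect: Connection timed out, GET /pubsite/script/preferences.js HTTP/1.1
May  5 12:11:48 rp1 pound: BackEnd 192.168.1.120:8080 resurrect
May  5 13:04:22 rp1 pound: backend 192.168.1.34:8080 connect: Connection timed out, GET /pubsite/images/flags/gif/VC.gif HTTP/1.1
May  5 13:04:48 rp1 pound: BackEnd 192.168.1.34:8080 resurrect
"""

# Matches the syslog timestamp, the backend address and the event keyword.
LINE = re.compile(
    r"^(?P<ts>\w{3}\s+\d+ \d\d:\d\d:\d\d) \S+ pound: "
    r"(?:backend|BackEnd) (?P<addr>\S+) (?P<event>connect:|resurrect)"
)


def downtime_per_event(log_text, year=2007):
    """Return [(backend, seconds)] from each connect timeout to its resurrect."""
    pending = {}   # backend address -> time of the last logged connect failure
    results = []
    for line in log_text.splitlines():
        match = LINE.match(line)
        if not match:
            continue
        # Collapse the double space syslog uses for single-digit days.
        stamp = " ".join(match["ts"].split())
        when = datetime.strptime(f"{year} {stamp}", "%Y %b %d %H:%M:%S")
        addr = match["addr"]
        if match["event"] == "connect:":
            pending[addr] = when
        elif addr in pending:
            gap = when - pending.pop(addr)
            results.append((addr, int(gap.total_seconds())))
    return results


for backend, seconds in downtime_per_event(LOG):
    print(f"{backend}: marked dead for {seconds}s")
```

Note that the gap starts when the failure is logged, so it does not include
the time the failed connect itself spent waiting.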


All of these files are tiny (no more than 10 KB, some as small as 600
bytes). I'm guessing the problem has to do with the traffic on our
site, but that doesn't explain why I don't see "Connection timed out"
messages for non-static content on port 80.  It's almost as if we have
the problem with static content only (packet size related?).  I have a
global "TimeOut" set to 300 and "Alive" set to 15.  We're running IIS
6.0 on the backends. In IIS, I have the Connection Timeout set to 10
seconds for port 8080 requests.


Any ideas would be appreciated.

Albert



Re: [Pound Mailing List] host dead/resurrect on static content.
Robert Segall <roseg(at)apsis.ch>
2007-05-08 17:32:38
On Sat, 2007-05-05 at 15:48 -0400, Albert wrote:
> I noticed a strange problem with one of my services (specifically 
> created for static content), and I was wondering if anybody else has 
> seen the issue.  Here's the pound config for the service:
> 
> Service
>         HeadRequire     "Host: .*mydomain.com.*"
>         URL             ".*\.(gif|jpg|js|css|ico)$"
>         BackEnd
>                 Address 192.168.65.120
>                 Port    8080
>                 HAPort  5555
>         End
>         BackEnd
>                 Address 192.168.1.120
>                 Port    8080
>                 HAPort  5555
>         End
>         BackEnd
>                 Address 192.168.65.177
>                 Port    8080
>                 HAPort  5555
>         End
>         BackEnd
>                 Address 192.168.1.34
>                 Port    8080
>                 HAPort  5555
>         End
> End
> 
> 
> I have an almost identical Service for non-static content (without URL 
> directive) that goes to the same 4 servers, but on port 80 for 
> backends.  Here's what I see in my pound error log (I made a minor 
> change in http.c to display the request that causes the time outs) :
> 
> May  5 12:08:18 rp1 pound: backend 192.168.65.120:8080 connect: 
> Connection timed out, GET /pubsite/images/alacra_store_logo_left.gif 
> HTTP/1.1
> May  5 12:08:18 rp1 pound: BackEnd 192.168.65.120:8080 resurrect
> May  5 12:11:42 rp1 pound: backend 192.168.1.120:8080 connect: 
> Connection timed out, GET /pubsite/script/preferences.js HTTP/1.1
> May  5 12:11:48 rp1 pound: BackEnd 192.168.1.120:8080 resurrect
> May  5 13:04:22 rp1 pound: backend 192.168.1.34:8080 connect: Connection 
> timed out, GET /pubsite/images/flags/gif/VC.gif HTTP/1.1
> May  5 13:04:48 rp1 pound: BackEnd 192.168.1.34:8080 resurrect
> May  5 13:32:26 rp1 pound: backend 192.168.65.177:8080 connect: 
> Connection timed  out, GET /pubsite/images/flags/gif/CL.gif HTTP/1.1
> May  5 13:32:35 rp1 pound: BackEnd 192.168.65.177:8080 resurrect
> May  5 14:29:07 rp1 pound: backend 192.168.1.34:8080 connect: Connection 
> timed out, GET /pubsite/images/homesearch.gif HTTP/1.1
> May  5 14:29:14 rp1 pound: BackEnd 192.168.1.34:8080 resurrect
> May  5 14:40:13 rp1 pound: backend 192.168.1.34:8080 connect: Connection 
> timed out, GET /pubsite/script/pubsite.js HTTP/1.1
> May  5 14:40:30 rp1 pound: BackEnd 192.168.1.34:8080 resurrect
> May  5 14:47:05 rp1 pound: backend 192.168.1.120:8080 connect: 
> Connection timed out, GET /pubsite/images/magnifying.gif HTTP/1.1
> May  5 14:47:30 rp1 pound: BackEnd 192.168.1.120:8080 resurrect
> May  5 14:48:52 rp1 pound: backend 192.168.1.120:8080 connect: 
> Connection timed out, GET /pubsite/images/searching.gif HTTP/1.1
> May  5 14:49:00 rp1 pound: BackEnd 192.168.1.120:8080 resurrect

Just a wild guess: are your Alive and TimeOut definitions AFTER the
Service? In that case they do not apply to previously defined services,
which would be consistent with your log.
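
The ordering question can be checked mechanically. Below is a rough sketch,
not a pound tool: it scans a config for top-level TimeOut or Alive lines that
appear after the first Service. The function name late_globals and the list
of block-opening keywords are assumptions made for illustration and may not
cover every directive pound accepts.

```python
# Keywords assumed to open a block that is closed by "End".
BLOCK_OPENERS = {"listenhttp", "listenhttps", "service", "backend",
                 "emergency", "session"}
GLOBALS = {"timeout", "alive"}


def late_globals(config_text):
    """Return (line_number, directive) for top-level TimeOut/Alive lines
    that come after the first Service block."""
    depth = 0
    seen_service = False
    late = []
    for lineno, raw in enumerate(config_text.splitlines(), start=1):
        line = raw.split("#", 1)[0].strip()   # drop comments and blank lines
        if not line:
            continue
        word = line.split()[0]
        keyword = word.lower()
        if keyword == "end":
            depth = max(depth - 1, 0)
        elif keyword in BLOCK_OPENERS:
            if keyword == "service":
                seen_service = True
            depth += 1
        elif keyword in GLOBALS and depth == 0 and seen_service:
            late.append((lineno, word))
    return late


GOOD = """\
TimeOut 300
Alive 15
Service
    BackEnd
        Address 192.168.65.120
        Port 8080
    End
End
"""

BAD = """\
Service
    BackEnd
        Address 192.168.65.120
        Port 8080
    End
End
TimeOut 300
Alive 15
"""

print(late_globals(GOOD))
print(late_globals(BAD))
```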

> All of these files are tiny (no more than 10 KB, some as small as 600
> bytes). I'm guessing the problem has to do with the traffic on our
> site, but that doesn't explain why I don't see "Connection timed out"
> messages for non-static content on port 80.  It's almost as if we have
> the problem with static content only (packet size related?).  I have a
> global "TimeOut" set to 300 and "Alive" set to 15.  We're running IIS
> 6.0 on the backends. In IIS, I have the Connection Timeout set to 10
> seconds for port 8080 requests.
> 
> 
> Any ideas would be appreciated.

Have a look at your config file - I'm quite sure that the content type
has nothing to do with the connection time-out.
-- 
Robert Segall
Apsis GmbH
Postfach, Uetikon am See, CH-8707
Tel: +41-44-920 4904


Re: [Pound Mailing List] host dead/resurrect on static content.
Albert <pound(at)alacra.com>
2007-05-09 19:45:27

> Just a wild guess: are your Alive and TimeOut definitions AFTER the
> Service? In that case they do not apply to previously defined services,
> which would be consistent with your log.
>   
Yes, the global options are defined at the top of my config file.
> Have a look at your config file - I'm quite sure that the content type
> has nothing to do with the connection time-out.
>   
Yes, I'm pretty sure it's not content related.  I will run a few more
tests (including putting static content together with dynamic content
on port 80).  At this point I believe it's not pound but something on
our network (between the firewall and the switches).  Our pound server
sits in the DMZ and the backends are on the internal network. I'm
running some network tests now, trying to figure out why a small number
of (static) requests are not getting to our backends at all.
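
One simple network test is to time bare TCP connects from the pound host to
each backend on both ports and watch for slow or failed attempts. The sketch
below assumes the four backend addresses from the config earlier in the
thread; the function names probe and probe_all and the 2-second timeout are
illustrative choices, not anything pound provides.

```python
import socket
import time

# Backend addresses taken from the Service definition in this thread.
BACKENDS = ["192.168.65.120", "192.168.1.120", "192.168.65.177", "192.168.1.34"]


def probe(host, port, timeout=2.0):
    """Attempt one TCP connect; return (ok, seconds_elapsed, error_text)."""
    start = time.monotonic()
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True, time.monotonic() - start, ""
    except OSError as exc:   # covers timeouts, refusals and routing errors
        return False, time.monotonic() - start, str(exc)


def probe_all(hosts=BACKENDS, ports=(80, 8080), timeout=2.0):
    """Probe every host/port pair once and print one line per attempt."""
    for host in hosts:
        for port in ports:
            ok, elapsed, err = probe(host, port, timeout)
            status = "ok" if ok else f"FAILED ({err})"
            print(f"{host}:{port} {elapsed * 1000:.1f} ms {status}")
```

Calling probe_all() in a loop (for example once a second) from the DMZ host
would show whether connects to port 8080 stall while port 80 stays healthy.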




Re: [Pound Mailing List] host dead/resurrect on static content.
Albert <pound(at)alacra.com>
2007-05-11 00:35:29
> Yes, I'm pretty sure it's not content related.  I will run a few more
> tests (including putting static content together with dynamic content
> on port 80).  At this point I believe it's not pound but something on
> our network (between the firewall and the switches).  Our pound server
> sits in the DMZ and the backends are on the internal network. I'm
> running some network tests now, trying to figure out why a small
> number of (static) requests are not getting to our backends at all.
>
First off, I want to apologize for using this forum to figure out this
problem, but it might be related to pound (though at this point I'm
thinking it's something else).

Two items I wanted to mention:
1.  The problem we're experiencing is with static content that is being
retrieved over an already open connection (so the client might have
received a bunch of images and be stuck in the middle).  However, it
looks like pound is calling connect_nb() on line 1241 of v2.3.1,
meaning that the backend connection was closed a few lines before.
connect_nb() then waits up to the TimeOut value to connect but fails,
and that's when pound kills the backend, thinking it's dead.  Are there
any tools I can use to figure out why this is happening?  What's
interesting is that I see other clients making requests and going to
the backend during the TimeOut interval (when I had it set to 5
minutes, I would see a lot of content requested and served).

2.  I believe the timeout used for connect_nb() to backends should be
lower than the general TimeOut (or at least I should be able to define
a lower one).  Here's my thinking: connect_nb() should be really quick
(if I can't connect within, say, 2 seconds, then maybe I should not use
that backend at all).  However, I don't want to set my TimeOut too low,
as the content I'm delivering might take some time to generate (after
the connection is established).  I think it makes sense to have a
"ConnectTimeOut" that can optionally be set and is used only for
connecting to backends.  We do a lot of database coding here, and we
have 2 separate timeouts: one to connect to a database and another to
run the query.  I think the same approach should be used here.
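
The two-timeout idea can be illustrated outside pound with plain sockets.
This is only a sketch of the general technique, not pound code and not a
patch to connect_nb(): the function name open_backend and the default values
(2 seconds to connect, 300 seconds for the response, mirroring the numbers
mentioned in this thread) are assumptions for illustration.

```python
import socket


def open_backend(host, port, connect_timeout=2.0, response_timeout=300.0):
    """Connect with a short timeout, then switch to a longer one for I/O.

    Raises OSError (including a timeout) if the connect does not finish
    within connect_timeout, so the caller can try another backend quickly.
    """
    # The short limit applies only to establishing the TCP connection.
    sock = socket.create_connection((host, port), timeout=connect_timeout)
    # From here on, reads and writes may wait for slow content generation.
    sock.settimeout(response_timeout)
    return sock
```

A caller would catch OSError from open_backend() and move on to the next
backend, while still giving a connected backend the full response timeout.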

Albert

