the ancient time_wait problem
Khaled Hassounah <khaled.hassounah(at)medhelp.org>
2007-05-08 06:16:15
Hello,

I am looking more for ideas than silver bullets here.

Assuming memory and bandwidth are not a limitation, the maximum number
of transactions that can be serviced per second is limited by the
number of available ports and the length of the time_wait period. With
an ephemeral port range of 32768 through 65535 (32768 ports) and a
60-second time_wait, the theoretical limit is somewhat below 546
requests per second (assuming the web server closes the connection, so
only the client connections linger in time_wait).

Our traffic at peak is more than 800 requests/sec. The following are the
options I could think of or find via Google, but I was wondering if
there is a different way to handle this limitation:

  Add more IP addresses that pound listens on and have our domain
point to all those IPs, thus splitting traffic roughly between them.
  Increase the ephemeral range, maybe lowering its start to 10000
  Lower the time_wait period (I don't like this option)

Are there any other ways to solve this?

Khaled

Re: [Pound Mailing List] the ancient time_wait problem
Ted Dunning <tdunning(at)veoh.com>
2007-05-08 07:57:53
I believe that modern Linux versions adjust the timeout according to demand.



Re: [Pound Mailing List] the ancient time_wait problem
Stefan Lambrev <stefan.lambrev(at)sun-fish.com>
2007-05-08 08:32:45
Hi,

Khaled Hassounah wrote:[...]
Can you explain how, and what, actually limits the "port range of 32768
through 65535"?
As I understand it, a socket is ip:port - ip2:port, which means that the
same port(s) can be used to connect to/from 2 different locations.
So this limit of 546 rps is with 1 back-end?[...]
I still do not understand why you think pound has this limit and web
servers do not.[...]
[...]

Re: [Pound Mailing List] the ancient time_wait problem
Khaled Hassounah <khaled.hassounah(at)medhelp.org>
2007-05-08 08:39:52
It is actually hard-coded to 60 seconds in newer kernels. You can even
change it.

It is set in /usr/src/kernels/<your kernel>/include/net/tcp.h by the
constant TCP_TIMEWAIT_LEN.

Khaled


Re: [Pound Mailing List] the ancient time_wait problem
Khaled Hassounah <khaled.hassounah(at)medhelp.org>
2007-05-08 08:48:37
Stefan,
 
  Can you explain how, and what, actually limits the "port range of 32768
  through 65535"?
  As I understand it, a socket is ip:port - ip2:port, which means that the
  same port(s) can be used to connect to/from 2 different locations.
  So this limit of 546 rps is with 1 back-end?

Operating systems usually have a range (it is actually part of the TCP
RFC) from which they assign local ports. In Linux it is in
/proc/sys/net/ipv4/ip_local_port_range; in Windows it is the registry
setting MaxUserPort (IIRC). Once all the ports in the range are used
(whether actually connected or lingering in the time_wait or close_wait
states), they can't be used until they are free again.

After the server closes the connection with the client, it puts the
connection in the time_wait state for 60 seconds (in Linux, while the
standard says 240). That means that if you are using all the sockets
all the time, you can cycle through the whole range once every 60
seconds. If the range gives you 32k ports, divide that by 60 and you
get the maximum number of connections per second.
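
The arithmetic above can be sketched in a few lines of Python. This is
only an illustration of the estimate as stated in this message; the
function names are made up for the example, and the result only bounds
connections that actually draw a port from the ephemeral range.

```python
def parse_port_range(text):
    """Parse the two integers stored in /proc/sys/net/ipv4/ip_local_port_range."""
    low, high = (int(field) for field in text.split())
    return low, high


def max_connection_rate(low, high, time_wait_seconds=60):
    """Upper bound on new connections per second if every port sits in
    time_wait for time_wait_seconds before it can be reused."""
    ports = high - low + 1
    return ports / time_wait_seconds


# Range quoted in this thread; on Linux the text could instead come from
# open("/proc/sys/net/ipv4/ip_local_port_range").read()
low, high = parse_port_range("32768\t65535\n")
print(f"{max_connection_rate(low, high):.1f}")        # 546.1 (60 s time_wait)
print(f"{max_connection_rate(low, high, 240):.1f}")   # 136.5 (240 s time_wait)
print(f"{max_connection_rate(10000, high):.1f}")      # 925.6 (range starting at 10000)
```

Under the same assumptions, starting the range at 10000 would lift the
estimate above the 800 requests/sec peak mentioned earlier.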

I approximated the number, since not all connections are in time_wait;
some are in the established state, but those are few compared to the
time_wait ones.
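
One way to check that approximation on a Linux box is to count sockets
per state. The sketch below parses text in the format of /proc/net/tcp,
where the fourth column is a hex state code (06 is TIME_WAIT, 01 is
ESTABLISHED). The function name and the sample rows are invented for
the example.

```python
from collections import Counter

# State codes used in the "st" column of /proc/net/tcp
TCP_STATES = {
    "01": "ESTABLISHED", "02": "SYN_SENT", "03": "SYN_RECV",
    "04": "FIN_WAIT1", "05": "FIN_WAIT2", "06": "TIME_WAIT",
    "07": "CLOSE", "08": "CLOSE_WAIT", "09": "LAST_ACK",
    "0A": "LISTEN", "0B": "CLOSING",
}


def count_tcp_states(proc_net_tcp_text):
    """Count sockets per TCP state from /proc/net/tcp-style text."""
    counts = Counter()
    for line in proc_net_tcp_text.splitlines()[1:]:  # skip the header row
        fields = line.split()
        if len(fields) < 4:
            continue
        code = fields[3].upper()
        counts[TCP_STATES.get(code, code)] += 1
    return counts


# Made-up sample rows; on Linux use open("/proc/net/tcp").read() instead
sample = """\
  sl  local_address rem_address   st tx_queue rx_queue
   0: 0100007F:0050 00000000:0000 0A 00000000:00000000
   1: 0100007F:0050 0100007F:C350 06 00000000:00000000
   2: 0100007F:0050 0100007F:C351 06 00000000:00000000
   3: 0100007F:0050 0100007F:C352 01 00000000:00000000
"""
counts = count_tcp_states(sample)
print(counts["TIME_WAIT"], counts["ESTABLISHED"], counts["LISTEN"])
```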

  I still do not understand why you think pound has this limit and web
  servers do not.

It is not pound, the same could happen with a web server. The
difference though is that pound would typically sit in front of a group
of web servers (it is a load balancer by design), so it will handle a
much higher rate of connections than will any of the web servers
sitting behind it.

Khaled

Re: [Pound Mailing List] the ancient time_wait problem
Stefan Lambrev <stefan.lambrev(at)sun-fish.com>
2007-05-08 09:30:07
Hi,


Khaled Hassounah wrote:[...]
Well, I know this, but my point is that a socket is IP:port - IP2:port2.
So what you have between pound and the clients (browsers) is:
www-ip:80-clientIP:port

The local port (where pound runs) of the socket is always 80 (or 443),
so the theoretical limit is 32K sockets per clientIP (well, it depends
on the client configuration).

On the other side we have back-endIP:Port1 - pound IP:Port, and here the
socket limit is again 32K per back-end,
so having 2 back-ends will increase this limit to 64K sockets.

Also, most OSes have other limits on the maximum number of sockets, and
this is a limitation of the OS.

My point is that there is no problem having two sockets at the same
time that use the same local port, right?

Re: [Pound Mailing List] the ancient time_wait problem
Khaled Hassounah <khaled.hassounah(at)medhelp.org>
2007-05-08 09:35:13
  Well, I know this, but my point is that a socket is IP:port - IP2:port2.
  So what you have between pound and the clients (browsers) is:
  www-ip:80-clientIP:port

  The local port (where pound runs) of the socket is always 80 (or 443),
  so the theoretical limit is 32K sockets per clientIP (well, it depends
  on the client configuration).

  On the other side we have back-endIP:Port1 - pound IP:Port, and here the
  socket limit is again 32K per back-end,
  so having 2 back-ends will increase this limit to 64K sockets.

  Also, most OSes have other limits on the maximum number of sockets, and
  this is a limitation of the OS.

  My point is that there is no problem having two sockets at the same
  time that use the same local port, right?

Yes. one of the options I listed is to add more IPs to the machine.

Khaled

Re: [Pound Mailing List] the ancient time_wait problem
Robert Segall <roseg(at)apsis.ch>
2007-05-08 17:38:20
On Mon, 2007-05-07 at 21:16 -0700, Khaled Hassounah wrote:[...]

Your description doesn't quite agree with my understanding of TCP/IP.

As far as I know, when you connect to a server socket only that socket's
port is used: do a netstat and you'll see that all the clients are
connected to the same port number. The ephemeral ports are relevant
for client connections (i.e. connections to the back-end) and NOT to the
listeners.
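
This is easy to check on the loopback interface. The sketch below
(function name invented for the example) opens a listener on a port
picked by the OS, connects a few clients, and compares the local port
of each accepted connection with the listening port.

```python
import socket


def accepted_local_ports(n_clients=2):
    """Connect n_clients to one loopback listener and report the ports used."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    clients, accepted = [], []
    try:
        listener.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
        listener.listen(n_clients)
        listen_port = listener.getsockname()[1]
        for _ in range(n_clients):
            clients.append(socket.create_connection(("127.0.0.1", listen_port)))
            conn, _peer = listener.accept()
            accepted.append(conn)
        # Local port of every accepted (server-side) connection
        server_side = [conn.getsockname()[1] for conn in accepted]
        # Local (ephemeral) port of every client-side connection
        client_side = [c.getsockname()[1] for c in clients]
        return listen_port, server_side, client_side
    finally:
        for s in clients + accepted + [listener]:
            s.close()


listen_port, server_side, client_side = accepted_local_ports(3)
print(listen_port, server_side, client_side)
```

Every accepted connection reports the listening port as its local port;
only the connecting side consumes a distinct ephemeral port per
connection.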

Pound sets LINGER at 10 seconds and LINGER2 (if available) at 5 seconds
for client connections, so the default system values do not apply.
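
For illustration, setting per-socket linger values like these could
look as follows in Python. This is a sketch, not Pound's code, and it
assumes that LINGER and LINGER2 refer to the SO_LINGER and TCP_LINGER2
socket options; the helper names are made up. On Linux, TCP_LINGER2
sets the lifetime of orphaned FIN_WAIT2 sockets.

```python
import socket
import struct


def apply_linger(sock, linger_seconds=10, linger2_seconds=5):
    """Enable SO_LINGER and, where the platform has it, TCP_LINGER2."""
    # struct linger { int l_onoff; int l_linger; }
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_LINGER,
                    struct.pack("ii", 1, linger_seconds))
    if hasattr(socket, "TCP_LINGER2"):  # Linux-specific option
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_LINGER2, linger2_seconds)


def read_linger(sock):
    """Return (l_onoff, l_linger) as currently set on the socket."""
    raw = sock.getsockopt(socket.SOL_SOCKET, socket.SO_LINGER,
                          struct.calcsize("ii"))
    return struct.unpack("ii", raw)


sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
apply_linger(sock)
onoff, seconds = read_linger(sock)
print(onoff != 0, seconds)
sock.close()
```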

Finally, the proof is in the pudding: we have web sites with well over
500 requests per second, with no "out of sockets" errors.

Re: [Pound Mailing List] the ancient time_wait problem
Khaled Hassounah <khaled.hassounah(at)medhelp.org>
2007-05-09 02:33:08
  Your description doesn't quite agree with my understanding of TCP/IP.

  As far as I know, when you connect to a server socket only that socket's
  port is used: do a netstat and you'll see that all the clients are
  connected to the same port number. The ephemeral ports are relevant
  for client connections (i.e. connections to the back-end) and NOT to the
  listeners.

  Pound sets LINGER at 10 seconds and LINGER2 (if available) at 5 seconds
  for client connections, so the default system values do not apply.

  Finally, the proof is in the pudding: we have web sites with well over
  500 requests per second, with no "out of sockets" errors.


You are right about the way it works; I was going down the wrong path. I
went back looking for more clues to understand what is happening, and
found that after about 4200 successful requests, I start getting a "no
back-end" error in my system log.

At the beginning (for about 10 instances) the "no back-end" and "connect:
Connection timed out" errors alternate. Then "no back-end" appears more
frequently, until it repeats a few hundred times in succession, with
"Connection timed out" appearing only tens of times after that.

I am still investigating, but I saw a few discussions about this on the
mailing list (with no satisfactory diagnosis), and thought someone might
have resolved a similar problem already.

P.S. For extra information, I tried running ab from the pound machine
against the web server and had no problems.



Re: [Pound Mailing List] the ancient time_wait problem
Stefan Lambrev <stefan.lambrev(at)sun-fish.com>
2007-05-09 07:41:40
Khaled Hassounah wrote:[...]
To do a proper benchmark you need at least 3 different servers: back-end
- pound - client (ab),
and again: back-end - client (ab) to test without pound.
Running ab on the back-end or on the pound server will badly affect the
results, as ab is quite greedy for resources :)
Your CPUs may then spend their time generating requests rather than
serving them.
Also, can you provide some info about the network you use: is it
100 Mbit or gigabit?

Re: [Pound Mailing List] the ancient time_wait problem
Ted Dunning <tdunning(at)veoh.com>
2007-05-09 18:03:38
I think Stefan misunderstood this last note.

Khaled was doing the very correct thing of running the test client on the
pound machine WITHOUT running pound.  This verifies that the pound machine
can actually generate the transactions and send them to the backends
cleanly.  This helps eliminate the possibility that the network between the
pound and backends is the problem.
