Hi Jim,
What does ethtool -k eth[?] say on your domU and dom0?

I was getting timeouts during backups with Backup Exec on Xen domUs until I turned TX checksum offload (tx-checksumming) off on the domUs. It would be fine for a while, then all of a sudden it would start losing packets, then recover. But those few lost packets were enough to stall the backup.

Maybe something like that is going on with Pound. It would also explain why you cannot reproduce the problem with ab.
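
If it helps to compare the domU and dom0 settings side by side, here is a rough Python sketch that parses the output of ethtool -k into on/off flags. The function names and the sample text are made up for illustration (the sample is not real output from any machine in this thread), and the exact feature names vary between ethtool versions, so treat it as a starting point only.

```python
import subprocess


def parse_offloads(text):
    """Parse `ethtool -k` output into {feature name: True/False}."""
    settings = {}
    for line in text.splitlines():
        name, sep, value = line.partition(":")
        words = value.split()
        # Skip the header line ("... for eth0:") and anything that is not on/off.
        if not sep or not words or words[0] not in ("on", "off"):
            continue
        settings[name.strip()] = words[0] == "on"
    return settings


def read_offloads(iface):
    """Run `ethtool -k IFACE` (lowercase -k only reads settings)."""
    result = subprocess.run(
        ["ethtool", "-k", iface], capture_output=True, text=True, check=True
    )
    return parse_offloads(result.stdout)


# Illustrative sample only; the interface name eth0 is an assumption.
SAMPLE = """\
Offload parameters for eth0:
rx-checksumming: on
tx-checksumming: on
scatter-gather: on
tcp segmentation offload: off
"""

enabled = [name for name, on in parse_offloads(SAMPLE).items() if on]
print("offloads currently on:", ", ".join(enabled))
```

On a real box you would call read_offloads("eth0") on both the domU and dom0 and compare the results. If tx-checksumming shows as on in the domU, "ethtool -K eth0 tx off" (uppercase -K, run as root) should turn it off; the setting does not survive a reboot unless you add it to your network startup scripts.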

Regards,
Jon

Jim Spath wrote:
Hi all,

First some info:
 Pound Version: 2.2.7 (also tested 2.3.2)
 OS:  Ubuntu 6.06 LTS running on Xen

We've been using Pound successfully for about 6 months here, but recently experienced some problems with it when one of our projects received a considerable increase in traffic.  We're seeing traffic of around 58 requests/second for this project.

At the same time traffic increased, we started getting the Pound "The service is not available" errors.  We increased the verbosity of the Pound logging and found that Pound was complaining about the backend servers timing out.

To solve this, we tried the following:
 - Upgrading Pound to the latest version.
 - Increasing the number of available file descriptors.
 - Increasing the TimeOut for each backend server.
 - Decreasing the Alive value.
 - Adding more backend servers.
 - Upgrading the backend servers.

None of these actions had any noticeable effect on the problem.

I ran some Apache benchmark tests on each of our backend servers and was surprised to find that I could not replicate the timeouts that Pound was complaining about (100% of requests came in under 50ms), nor could I find any evidence of the servers being overloaded.

This made me start thinking that the problem was Pound itself, or the machine it resides on, and that perhaps the connections that Pound claimed were "timing out" were not even making it to the backend servers.

So, we changed DNS to point directly to a _single_ backend server, instead of Pound.  Once we did that, everything worked perfectly.  The single server was able to handle the load with ease (we had 4 backend servers under Pound).

I had read that Pound could handle 600 req/s.  Is this incorrect?  Does it have something to do with the fact that the requests in our project are not static content, but rather dynamic content which has a delay, albeit a small one, associated with it?  Or could it have something to do with the fact that it is running on a virtual server?

I'd prefer this project to be properly load balanced under Pound, so any advice you guys could offer would be appreciated.

- Jim


--
Jon Higgs
Systems Administrator

i-xplore Pty Ltd
"Over 25 Years of Xtra Value Deals!"

6 Watts Street, Box Hill Vic 3128 Australia
Direct: (613) 9245 0791
Email: jhiggs@ixplore.com.au
Website: www.ixplore.com.au