Memory leak update
Thierry Coopman <thierry(at)keytradebank.com>
2004-08-02 20:05:07
Hi,

I've been running Pound for a bit more than one week now.

Memory use seems to be a lot better in -current than in the 1.7 version.
However, I think the problem is not completely gone. The process now 
takes a bit more than 100MB. Before, it reached that size after about 
4 hours, so something really helped :)

Now, can I help to find these last little tiny leaks so that this can 
run cleanly for days, weeks and months :)



Another remark/question is that I have the impression that our site is 
running slow when we get a lot of hits (120/sec). The load is 
distributed over 2 servers running Pound.
I checked the machines: they are running at 20 to 25% CPU load, with an 
overall load of 0.6. So network I/O is fast enough, and disks are a 
non-issue (since Pound doesn't use them)...

Now, I might have some things to optimise in the Linux kernel world. I 
can't seem to get this NPTL thing to work, and there might be some TCP/IP 
tuning to do.
Does the list have some other points to look for? The machines are 3GHz 
P4 with 2GB of ram.
[...]

Re: Memory leak update
"Simon Matter" <simon.matter(at)ch.sauter-bc.com>
2004-08-02 21:43:12
> Hi,[...]

Looks quite similar to what I have now. Since I have much less load I'm
just approaching the 20MB line, which was reached after 2 or 3 hours
before the upgrade.
[...]

Re: Memory leak update
Robert Segall <roseg(at)apsis.ch>
2004-08-03 14:46:33
On Monday 02 August 2004 20.05, Thierry Coopman wrote:[...]

I'd love having some help on this - I really don't know where to look next. 
Any and all ideas are welcome.
[...]

The -current version may generate more packets than the previous one, so it 
just might be possible for it to seem a bit slower, but I very much doubt it. 
A load average of 0.6 is not exactly a very busy machine, so the problem lies 
elsewhere. I would check network statistics, such as collision and error 
rates. BTW: are you running separate network cards for incoming and 
back-ends? If it is a single card and your requests are largeish (responses 
of about 25K) you may be approaching Ethernet saturation point at 120 
reqs/sec.
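
As a rough check of the saturation estimate, the arithmetic can be sketched in
Python. This is only a back-of-the-envelope sketch: the function name is
hypothetical, the 25K response size is read as 25 KiB, protocol overhead is
ignored, and a single card is assumed to carry each response twice (once from
the back-end, once out to the client).

```python
def link_load_mbps(reqs_per_sec, response_bytes, passes=2):
    """Return approximate traffic in megabits per second.

    passes=2 models a single network card carrying each response twice:
    once from the back-end to Pound and once from Pound to the client.
    """
    bits_per_sec = reqs_per_sec * response_bytes * 8 * passes
    return bits_per_sec / 1_000_000


if __name__ == "__main__":
    # 120 requests/sec with responses of about 25 KiB each.
    print(f"{link_load_mbps(120, 25 * 1024):.1f} Mbps")
```

Under those assumptions the result is roughly 49 Mbps, which is about half the
nominal capacity of a 100 Mbps link before any overhead. With separate cards
(passes=1) the figure halves.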
[...]

That's a lot of hardware for relatively light load - certainly enough for your 
needs. I do recommend trying NPTL, as it seems to bring a significant 
performance boost.[...]

Re: Memory leak update
Thierry Coopman <thierry(at)keytradebank.com>
2004-08-03 15:09:01
Robert Segall wrote:
[...]
* OpenSSL in threading setup
* I use OpenSSL session (for Mac users :)
* Pound users not using SSL are unaffected (can somebody confirm this?)
* I use a Server Gated Crypto cert (with an intermediate cert) from Verisign

That's all of the specifics I can think of. Now, could we try to have the 
code reviewed by one of the OpenSSL developers?
[...][...]
peak is 7-8 Mbps, so divided over 2 machines this gives 3-4 Mbps per 
machine on both interfaces. Most responses are very small (304 Not 
Modified statuses). About errors, here's the output of netstat -i:
Kernel Interface table
Iface    MTU Met      RX-OK RX-ERR RX-DRP RX-OVR      TX-OK TX-ERR TX-DRP TX-OVR Flg
eth0    1500   0  116321995      0      0      0  126501814      0      0      0 BMRU
eth1    1500   0   95988574      0      0      0  108237490      0      0      0 BMRU
lo     16436   0          0      0      0      0          0      0      0      0 LRU

Now that said, this effectively doubles the number of connections in the 
TCP/IP tables, so I'm going to see if I can do some more tweaking on 
that. Right now I have 2200 connections in TIME_WAIT state, which 
seems a lot...
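
One way to keep an eye on the TIME_WAIT count over time is to tally
connection states straight from /proc/net/tcp. The following is a minimal
Python sketch, not a tuning tool: the function name is hypothetical, it
assumes a Linux system, and it only reads the IPv4 table (IPv6 sockets live
in /proc/net/tcp6).

```python
from collections import Counter

# State codes the Linux kernel reports in the "st" column of /proc/net/tcp.
TCP_STATES = {
    "01": "ESTABLISHED", "02": "SYN_SENT", "03": "SYN_RECV",
    "04": "FIN_WAIT1", "05": "FIN_WAIT2", "06": "TIME_WAIT",
    "07": "CLOSE", "08": "CLOSE_WAIT", "09": "LAST_ACK",
    "0A": "LISTEN", "0B": "CLOSING",
}


def count_tcp_states(lines):
    """Count sockets per TCP state from /proc/net/tcp-style lines."""
    counts = Counter()
    for line in lines:
        fields = line.split()
        # Data rows start with a slot number such as "0:"; skip the header.
        if len(fields) < 4 or not fields[0].endswith(":"):
            continue
        code = fields[3].upper()
        counts[TCP_STATES.get(code, code)] += 1
    return counts


if __name__ == "__main__":
    try:
        with open("/proc/net/tcp") as f:
            for state, n in count_tcp_states(f).most_common():
                print(f"{state:12} {n}")
    except OSError:
        print("/proc/net/tcp is not readable on this system")
```

Run periodically, this shows whether TIME_WAIT grows with traffic or stays
roughly proportional to the request rate.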
[...][...]
Something in Gentoo doesn't agree with me :)


[...]

RE: Memory leak update
John D <jwdavid(at)ibizvision.com>
2004-08-03 16:51:31
Howdy,

I just wanted to share my situation as well. Pound was last restarted on
Saturday, early in the morning. When it first starts up it consumes 1496k, but
by today it was using 84m.
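
To put numbers like these on a timeline, the resident size can be sampled
from /proc/<pid>/status and turned into a growth rate. This is a minimal
Python sketch assuming Linux; all function names are hypothetical, and it
only measures growth, it does not locate the leak.

```python
import time


def parse_vmrss_kb(status_text):
    """Extract the resident set size in kB from /proc/<pid>/status text."""
    for line in status_text.splitlines():
        if line.startswith("VmRSS:"):
            return int(line.split()[1])
    return None  # some processes (e.g. kernel threads) have no VmRSS line


def growth_kb_per_hour(first, last):
    """Average growth between two (unix_time, rss_kb) samples."""
    (t0, kb0), (t1, kb1) = first, last
    if t1 <= t0:
        raise ValueError("samples must be in increasing time order")
    return (kb1 - kb0) * 3600.0 / (t1 - t0)


def sample(pid):
    """Take one (unix_time, rss_kb) sample for the given process id."""
    with open(f"/proc/{pid}/status") as f:
        return time.time(), parse_vmrss_kb(f.read())
```

Calling sample() on the Pound process at intervals (from cron, for instance)
and logging the results makes it easier to compare leak rates between
versions and between sites with different load.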

We have also noticed slowness, but not the same as described below. Most of the
time, all works really fast, but every once in a while a connection will take
really long (from the end user's point of view). I haven't really looked into
where the problem lies, but after restarting Pound, the slow responses seem
fewer and farther between.

FYI, this server runs Mandrake 9.2; from what I can tell, NPTL is not available
on this system. This is the latest -current (Jul 20 2004).

Any insight into this issue will be helpful.

John D.

Re: Memory leak update
Robert Segall <roseg(at)apsis.ch>
2004-08-04 13:49:07
On Tuesday 03 August 2004 15.09, Thierry Coopman wrote:[...]

I agree with your diagnosis - if there is a problem it certainly occurs only 
with SSL connections. I very much doubt it has anything to do with the Server 
Gated Crypto.
[...]

That would be great. Are you in contact with them? Let me know - you have my 
full support.
[...]

You may want to check on the system parameters, such as delays after ACK.
[...]

Surely not ego.[...]
