error copy server cont: Connection reset by peer
Jean-Francois Stenuit <jfs(at)skynet.be>
2007-05-02 09:54:00
Hi list,

I'm coming back to this old story. Apparently there has been some
discussion about it on the list, but without a definitive answer.

Basically, the symptom is that, on a loaded server, the logs fill with a
lot of lines like this:
Apr 30 00:26:46 revproxy07 pound: error copy server cont: Connection reset by peer

Unfortunately, the code is not very clear about what exactly causes the
error, so I made some patches to find the call that triggers it. I
found that the line causing the error is always the same one, in
function copy_bin():
        if(BIO_flush(be) != 1)

Now, reading the man page for BIO_flush(), I see that:
        BIO_flush() normally writes out any internally buffered data, in some
        cases it is used to signal EOF and that no more data will be written.

My question is therefore the following: shouldn't we call BIO_flush()
outside of the "while" loop? I.e. shouldn't we flush output to the client
browser after all the data has been written?

Also, shouldn't we just ignore the return code of BIO_flush()?

Anyway, the conclusion so far is that some browsers, when receiving
some responses, close the connection instead of waiting for the server to
close it. The next question is: which browser is to blame?

--
This mail is being sent from my personal account. It reflects my own views
and not the ones of any of my employers.
-- 
  |--- Jean-Francois "Jef" Stenuit

Re: [Pound Mailing List] error copy server cont: Connection reset by peer
Robert Segall <roseg(at)apsis.ch>
2007-05-02 18:24:44
On Wed, 2007-05-02 at 09:54 +0200, Jean-Francois Stenuit wrote:
> Anyway, the conclusion so far is that some browsers, when receiving
> some responses, close the connection instead of waiting for the server to
> close it. The next question is: which browser is to blame?

There's no browser to blame and no error. It's completely normal
behaviour, such as the user clicking the "stop" button or exiting the
browser before the response was (completely) received.

What you could do is to simply split your log by message type (traffic
is INFO, these messages are NOTICE, other problems are ERR/WARNING). See
the man page for syslog.conf(5) or syslogd(8).
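
As an illustration of why such a split is possible, here is a minimal
sketch of a daemon tagging each message with a syslog priority. It is not
Pound's logging code: the enum msg_kind and the functions priority_for and
log_message are hypothetical names, and only syslog(3) and the LOG_*
constants are real. The mapping follows the levels listed above.

```c
#include <syslog.h>

/* Hypothetical message classes, mirroring the three groups named above. */
enum msg_kind { MSG_TRAFFIC, MSG_CLIENT_ABORT, MSG_PROBLEM };

/* Map a message class to a syslog priority. */
int priority_for(enum msg_kind kind)
{
    switch (kind) {
    case MSG_TRAFFIC:
        return LOG_INFO;    /* ordinary request logging */
    case MSG_CLIENT_ABORT:
        return LOG_NOTICE;  /* e.g. "Connection reset by peer" */
    default:
        return LOG_ERR;     /* real problems (could also be LOG_WARNING) */
    }
}

/* Send the text to syslogd with the priority chosen for its class. */
void log_message(enum msg_kind kind, const char *text)
{
    syslog(priority_for(kind), "%s", text);
}
```

Because the priority travels with every message, syslogd can route each
level to a different file according to the selectors in syslog.conf.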
-- 
Robert Segall
Apsis GmbH
Postfach, Uetikon am See, CH-8707
Tel: +41-44-920 4904


Re: [Pound Mailing List] error copy server cont: Connection reset by peer
Jean-Francois Stenuit <jfs(at)skynet.be>
2007-05-02 20:32:56
On Wed, 2 May 2007, Robert Segall wrote:

> On Wed, 2007-05-02 at 09:54 +0200, Jean-Francois Stenuit wrote:
>> Hi list,
>>
>> I'm coming back to this old story. Apparently there has been some
>> discussion about it on the list, but without a definitive answer.
>
> There's no browser to blame and no error. It's completely normal
> behaviour, such as the user clicking the "stop" button or exiting the
> browser before the response was (completely) received.

I would tend to agree with you if I were seeing only sporadic messages,
but I am talking about multiple messages per second, almost 10% of the
number of requests, and, strangely enough, only on some of the
load-balanced Pound servers. So I'm still convinced that there is
something fishy with this configuration.

By the way, there is something I'm almost sure about: we shouldn't call
BIO_flush() if we don't write anything. Is there a reason why the
"if(!no_write)" encompasses only the BIO_write()?

> What you could do is to simply split your log by message type (traffic
> is INFO, these messages are NOTICE, other problems are ERR/WARNING). See
> the man page for syslog.conf(5) or syslogd(8).

-- 
Jean-François "Jef" Stenuit

Re: [Pound Mailing List] error copy server cont: Connection reset by peer
Robert Segall <roseg(at)apsis.ch>
2007-05-04 16:51:08
On Wed, 2007-05-02 at 20:32 +0200, Jean-Francois Stenuit wrote:
> I would tend to agree with you if I were seeing only sporadic messages,
> but I am talking about multiple messages per second, almost 10% of the
> number of requests, and, strangely enough, only on some of the
> load-balanced Pound servers. So I'm still convinced that there is
> something fishy with this configuration.

If so - what would be wrong? Have you looked at the router(s) logs?
Could it be that you get network congestion when several requests arrive
at the same time and thus the time-outs tend to bunch up?

> By the way, there is something I'm almost sure about: we shouldn't call
> BIO_flush() if we don't write anything. Is there a reason why the
> "if(!no_write)" encompasses only the BIO_write()?

Fair enough, it might be a bit more efficient. We'll change that in the
next version.
-- 
Robert Segall
Apsis GmbH
Postfach, Uetikon am See, CH-8707
Tel: +41-44-920 4904