Backend replication solution with pound?
Sebastiaan van Erk <sebster(at)sebster.com>
2009-04-24 17:25:46
Hi,

I'm trying to make a replicated image repository. I want to have two 
servers with big disks with the images stored on both of them (purely 
for redundancy). When a new image is added to the repository it will 
generally appear on one of the two servers before it appears on the 
other (mostly by only a few seconds). Also, if one server crashes and 
comes back, the new images that were stored in the meantime are not yet 
available on both servers and will need to be rsynced. However, as soon 
as the images are stored on 1 of the 2 servers, they must become available.

The only read access to the images from the outside world is via HTTP. I 
was thinking of using pound to get the images from the backends, 
however, there is only the small problem of (temporary) failure when an 
image that is not yet on one of the two servers gets requested by pound 
on the server where it's missing. In that case I would like pound to try 
the other server (basically on a 404). Only if it gets a 404 from both 
servers is it a REAL 404.
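
The "404 failover" described above can be sketched in a few lines. This is only an illustration of the intended behaviour, not pound configuration and not an existing API: the function name, the backend names, and the injected `fetch` callable are all hypothetical.

```python
from typing import Callable, Dict, Iterable, Tuple

# Hypothetical response shape: (HTTP status code, body bytes).
Response = Tuple[int, bytes]


def fetch_with_404_failover(
    path: str,
    backends: Iterable[str],
    fetch: Callable[[str, str], Response],
) -> Response:
    """Try each backend in order; a 404 is final only if every backend returns it."""
    last: Response = (404, b"")
    for backend in backends:
        status, body = fetch(backend, path)
        if status != 404:
            # Any non-404 answer (including errors) is passed through unchanged.
            return status, body
        last = (status, body)
    return last


# Stand-in for two image servers that are temporarily out of sync (assumed data).
STORE: Dict[str, Dict[str, bytes]] = {
    "img-a": {"/1.jpg": b"one"},
    "img-b": {"/1.jpg": b"one", "/2.jpg": b"two"},
}


def fake_fetch(backend: str, path: str) -> Response:
    """Pretend HTTP GET against the in-memory STORE above."""
    body = STORE[backend].get(path)
    return (200, body) if body is not None else (404, b"")


if __name__ == "__main__":
    # /2.jpg is missing on img-a, so the request falls through to img-b.
    print(fetch_with_404_failover("/2.jpg", ["img-a", "img-b"], fake_fetch))
```

In a real deployment `fetch` would perform the HTTP GET; here it is injected so the logic can be exercised without a network.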

In the final solution I want to be able to send the requests to a 
preferred server based on a header in the HTTP request, but from what I 
saw, that should be doable with pound.
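
As a rough sketch of the header-based preference: the header name `X-Preferred-Backend` and the backend names below are made up for illustration, and this shows only the ordering logic, not how pound itself would be configured.

```python
from typing import List, Mapping, Sequence


def order_backends(
    headers: Mapping[str, str],
    backends: Sequence[str],
    header_name: str = "X-Preferred-Backend",  # hypothetical header name
) -> List[str]:
    """Return the backends with the one named in the request header first."""
    # HTTP header names are case-insensitive, so compare in lower case.
    lowered = {name.lower(): value for name, value in headers.items()}
    preferred = lowered.get(header_name.lower())
    if preferred in backends:
        return [preferred] + [b for b in backends if b != preferred]
    # No header, or an unknown backend name: keep the default order.
    return list(backends)


if __name__ == "__main__":
    print(order_backends({"x-preferred-backend": "img-b"}, ["img-a", "img-b"]))
```

The non-preferred server stays in the list, so it is still available as the fallback.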

The question I have is, can I do the "404 failover" with pound, or do I 
need to patch it? Is it easy to patch it for this use? Is this a good 
way to go about it, or are there better solutions for the above 
requirements?

Regards,
Sebastiaan

RE: [Pound Mailing List] Backend replication solution with pound?
Tony Howat <arhowat(at)hotmail.com>
2009-04-24 17:47:17
sebster(at)sebster.com asked :
 [...]


The proper solution for this would be shared storage for your web nodes, or
network partition mirroring using something like drbd.

 

--

Tony

Re: [Pound Mailing List] Backend replication solution with pound?
Dave Steinberg <dave(at)redterror.net>
2009-04-24 18:33:15
Tony Howat wrote: [...]

To answer the original question - pound doesn't have the feature you're 
looking for.  You might try varnish - I know it can do it - or something 
like nginx.

You might also consider something like squid as a layer between your 
load balancer and your backends.  Squid supports logic about getting 
documents from neighboring proxies (I forget what it's called; it's some 
proxy communication protocol).  You might be able to use this to give 
you a layer that hides details of your semi-consistent storage layer 
from your load balancer / the outside world.

Regards,[...]

Re: [Pound Mailing List] Backend replication solution with pound?
Sebastiaan van Erk <sebster(at)sebster.com>
2009-04-24 18:58:36
Hi,

Thanks for the reply.

Tony Howat wrote: [...]

Why is that the proper solution? There are a couple of issues I have with 
both of these solutions (and I did think of them before coming up with 
the question above):

1) The point of the exercise is redundancy on disk level. This would 
immediately disqualify most shared file systems, because they're 
generally not redundant (with some exceptions of course, e.g., AFS, 
which is very complicated and management intensive). Solutions such as 
hardware mirroring/RAID tend to occur within one physical device so that 
if the main board of the device fails your disks are gone (and if not 
dead, then at least not accessible anymore at that moment).

2) Disk replication solutions like drbd are not portable (i.e., I don't 
want to be tied to Linux), and they are also much more heavyweight, low 
level and complicated than I need. They have to support much richer 
semantics than I need: all I need is read-only access to entire images, 
i.e., HTTP GET. I don't need block level replication; I don't need 
advanced underlying filesystem/locking support, etc.

The HTTP level solution has many advantages that I can think of:

1) The interface is standard HTTP; I can use whatever operating system I 
want to implement this interface. I can put plain disks in the server, 
use RAID, or put racks with hard disks behind them. The interface to the 
application is not changed.

2) It is simple. All I need to do is make sure files appear in the web 
space atomically (e.g., write to a temp file on the same partition, and 
rename). If a file is on either of the servers, it's available from the 
"cluster". I can keep them in sync using rsync, bidirectionally, after a 
crash or an offline period. Generally I will try to write them to both 
disks, but if one is offline and I can't, the rsync will get it later, 
and meanwhile the "404 failover" combined with regular pound failover 
will still allow the image to be available.

3) It is robust and all the software used is open source and well 
supported. Shared disk solutions such as NFS aren't jokingly called 
"network failure system" for nothing. I've had servers hang 
because the NFS mount became unavailable. Stuff like that just can't 
happen with this solution. The rsync recovery is also simple and robust, 
unlike the file system inconsistencies you could get when recovering 
block device replication. Also, you can use this solution in a 
"master-master" configuration, i.e., both servers serving data at all times.

If there is a better way to achieve what I want, then certainly I want 
to know. I saw something similar to what I describe (MogileFS), but its 
drawbacks (IMO) are that it needs a special client interface and is hard 
to back up (not a POSIX filesystem).
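
For reference, the atomic publish step from point 2 of the HTTP-level list above (write to a temp file on the same partition, then rename) could look roughly like this. It is a minimal sketch: the function name, the `.upload-` prefix and the 0644 file mode are assumptions, not part of any existing tool.

```python
import os
import tempfile


def publish_atomically(data: bytes, dest_path: str) -> None:
    """Write data next to dest_path, then rename it into place in one step."""
    dest_dir = os.path.dirname(os.path.abspath(dest_path))
    # The temp file lives in the destination directory, so it is on the same
    # partition and the final rename never has to copy data across devices.
    fd, tmp_path = tempfile.mkstemp(dir=dest_dir, prefix=".upload-")
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            tmp.flush()
            os.fsync(tmp.fileno())  # make sure the bytes are on disk first
        # mkstemp creates the file as 0600; assume the web server needs 0644.
        os.chmod(tmp_path, 0o644)
        # os.replace overwrites an existing destination in a single rename.
        os.replace(tmp_path, dest_path)
    except BaseException:
        # Do not leave a partial upload behind in the web space.
        try:
            os.unlink(tmp_path)
        except FileNotFoundError:
            pass
        raise


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as webroot:
        target = os.path.join(webroot, "image.jpg")
        publish_atomically(b"fake image bytes", target)
        print(os.listdir(webroot))
```

A reader doing an HTTP GET then sees either the complete file or a 404, never a half-written image. The dot-prefixed temp names would still need to be excluded from whatever the web server serves and from the rsync run.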

Regards,
Sebastiaan

Re: [Pound Mailing List] Backend replication solution with pound?
Sebastiaan van Erk <sebster(at)sebster.com>
2009-04-24 19:03:09
Hi,

Dave Steinberg wrote:[...][...]
>>> In the final solution I want to be able to send the requests to a
>>> preferred server based on a header in the HTTP request, but from what
>>> I saw, that should be doable with pound.
>>>
>>> The question I have is, can I do the "404 failover" with pound, or do
>>> I need to patch it? Is it easy to patch it for this use? Is this a
>>> good way to go about it, or are there better solutions for the above
>>> requirements?
[...]

Hadn't heard of varnish, will look at that, thanks for the tip. Nginx I 
have heard of, but didn't check if it could do this, will look at that too.
[...]

That also sounds like a good possibility.

Thanks for the leads!

Regards,
Sebastiaan

Re: [Pound Mailing List] Backend replication solution with pound?
Sebastiaan van Erk <sebster(at)sebster.com>
2009-04-27 16:52:33
Hi,

Dave Steinberg wrote:[...][...]

Just wanted to report back that using the combination of varnish as web 
accelerator (using VCL to do the "404 failover") and nginx as web 
server, I was able to implement this very easily, and it works perfectly.

Thanks again for the answer!

Regards,
Sebastiaan
