Re: [Pound Mailing List] RewriteLocation issues
"Martins Galenieks [Aluminati]" <martins(at)aluminati.net>
2007-07-03 16:03:31
Hello again,

I would like to report that I had to copy /etc/nsswitch.conf,
/lib/libresolv.so* and /lib/libns*.so* into the RootJail directory. With
that directive enabled, Pound could not otherwise perform the DNS lookups
it depends on in order to apply RewriteLocation successfully.
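
A minimal sketch of that copy step, in Python, for anyone who wants to script it. The function name, the `source_root` parameter and the jail path mentioned below are assumptions made for illustration; only the three file patterns come from this message.

```python
import glob
import os
import shutil

# The three patterns named in the message above.
RESOLVER_FILES = ["/etc/nsswitch.conf", "/lib/libresolv.so*", "/lib/libns*.so*"]


def copy_into_jail(patterns, jail_root, source_root="/"):
    """Copy every file matching the glob patterns into jail_root,
    recreating the same directory layout underneath it.

    Returns the list of destination paths that were written.
    """
    copied = []
    for pattern in patterns:
        # Patterns are written as absolute paths; resolve them under source_root.
        search = os.path.join(source_root, pattern.lstrip("/"))
        for src in sorted(glob.glob(search)):
            if not os.path.isfile(src):
                continue
            rel = os.path.relpath(src, source_root)
            dst = os.path.join(jail_root, rel)
            os.makedirs(os.path.dirname(dst), exist_ok=True)
            # copy2 follows symlinks, so the real library file is copied.
            shutil.copy2(src, dst)
            copied.append(dst)
    return copied
```

It would be called as `copy_into_jail(RESOLVER_FILES, "/var/pound/jail")`, where `/var/pound/jail` is a hypothetical path standing in for whatever RootJail is set to in your own configuration.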

So the problem does not lie between lines 1060 and 1070 of http.c at all;
those are fine.

Please regard this email as closing the ticket. I apologise for the
false alarm.

Warm regards,
Martins

> Hi guys,
>
> I have been stuck for two days now because I don't get a properly
> rewritten Location: header from Pound. I suspect the problem is in the
> http.c file.
>
> Scheme deployed is:
> Browser->Pound(ListenHTTPS)->WebServer_HTTP->Pound->Browser
>
> Basically, the WebServer issues Location: http://v_host/loc_path, which is
> sent to Pound.
>
> But Pound does not rewrite Location: http://v_host/loc_path to the needed
> https://v_host/loc_path.
>
> RewriteLocation does not function properly, whichever of the values 0|1|2
> it is given. Combining it with RewriteCond 0|1 does not help either. It
> might sound stupid, but that is how it is.
>
> I compiled Pound-2.3.2 in the source directory with the following, quite
> conservative, options:
> CFLAGS=" -O2 -pipe " ./configure --prefix=/usr --infodir=/usr/share/info \
>   --mandir=/usr/share/man --with-ssl=/usr --with-owner=poundweb \
>   --with-group=poundweb
>
> `ldd /usr/sbin/pound` does not show any problems with the system
> libraries either.
>
> Could the problem lie in http.c, between lines 1060 and 1070?
>
> Or could there be another problem that I have not spotted?
>
>
> --
> To unsubscribe send an email with subject 'unsubscribe' to pound(at)apsis.ch.
> Please contact roseg(at)apsis.ch for questions.
> http://www.apsis.ch/pound/pound_list/archive/2007/2007-06/1183138984000
>



Re: [Pound Mailing List] question about https redirect
cosmih <cosmih(at)gmail.com>
2007-07-04 11:59:08
In the end ... can I do that?

I can use rinetd for the port redirect, but that way I lose the source IP,
and my web application uses the source IP for some filtering and for a form
of authentication.

regards,
mihai


On 6/28/07, cosmih <cosmih(at)gmail.com> wrote:
>
>
>
> > Why not? Is anything listening on .232?
> >
>
>
> Why not what?
>
> Why is
> "https://172.16.20.232/somedir/somefile.html?ART1=val1&ART2=val2&ART3=val3"
> not working?
>
> On the 172.16.20.232 machine an Apache daemon is running, and on this IP
> it listens only on port 443.
>
> Why is it not working? Because this Apache has a vhost configured, and it
> responds to queries only for secure.myhost.com.
>
> Moreover, the SSL certificate from VeriSign is only for the host
> secure.myhost.com, and if I make Apache respond to queries that contain
> the IP in the URL, the browser warns about the difference between the CN
> attribute of the certificate and the host in the URL.
>
> And I want the redirect to go unnoticed by the client browser.
>
>
> regards,
> mihai


Re: [Pound Mailing List] unexpected chunked EOF: Connection timed out
Robert Segall <roseg(at)apsis.ch>
2007-07-04 15:35:57
On Thu, 2007-06-28 at 12:46 -0700, Ed Sawicki wrote:
> I set up two back ends that serve identical content:
> 
> 1. publicfile Web server - always returns chunked
>    responses for any HTTP/1.1 request.
> 
> 2. thttpd Web server - always returns a Content-Length:
>    header for HTTP/1.1 requests for static content (does
>    not use chunked encoding)
> 
> When I use the publicfile back end, I frequently see the
> above mentioned errors in the pound logs and the browser
> user experiences delays.
> 
> When I use the thttpd back end, there are no errors and
> everything works as expected.
> 
> Is there something more I can do to characterize the
> problem?
> 
> Ed

Thanks for your help Ed. 2.4b had indeed introduced a problem with
chunked transfer-encoding. This should be fixed in the just-released
2.4c - give it a try and let me know.
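
For readers unfamiliar with the framing involved, here is a minimal Python sketch of how a chunked body is decoded and how a premature end of stream can be detected. It is not Pound's code; the function and exception names are invented for illustration, and it assumes a buffered binary stream such as `io.BytesIO` or a socket wrapped with `makefile("rb")`.

```python
import io


class UnexpectedChunkedEOF(Exception):
    """The stream ended before the chunked body was complete."""


def read_chunked(stream):
    """Decode an HTTP/1.1 chunked body from a binary stream; return the payload."""
    body = bytearray()
    while True:
        size_line = stream.readline()
        if not size_line.endswith(b"\r\n"):
            raise UnexpectedChunkedEOF("EOF while reading chunk size")
        # The size is hexadecimal; anything after ';' is a chunk extension.
        size = int(size_line.split(b";", 1)[0].strip(), 16)
        if size == 0:
            # Last chunk: skip optional trailer lines up to the blank line.
            # This sketch is lenient and also accepts EOF here.
            while stream.readline() not in (b"\r\n", b""):
                pass
            return bytes(body)
        data = stream.read(size)
        if len(data) < size:
            raise UnexpectedChunkedEOF("EOF inside chunk data")
        body += data
        if stream.read(2) != b"\r\n":
            raise UnexpectedChunkedEOF("missing CRLF after chunk data")
```

A response that carries Content-Length needs none of this: the reader simply consumes that many bytes, which fits the observation that only the chunked back end triggered the error.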
-- 
Robert Segall
Apsis GmbH
Postfach, Uetikon am See, CH-8707
Tel: +41-44-920 4904


Re: [Pound Mailing List] anti-scraping technology
Robert Segall <roseg(at)apsis.ch>
2007-07-04 15:38:58
On Wed, 2007-06-27 at 16:51 -0400, Albert wrote:
> This question has nothing to do with pound, and I apologize for posting 
> it here, but I was wondering if anybody could recommend anti-scraping 
> technology?  We've been seeing some scraping done on our site, and 
> would like the ability to slow or block such crawlers (without manual 
> intervention).
> 
> Thanks,
> 
> Albert

I guess that depends on how you can identify the crawlers. I expect you
already looked into the simple answers (firewall for source addresses,
HeadDeny for header-based identification, traffic shaping for many quick
requests from the same source address). Is there anything I missed?
-- 
Robert Segall
Apsis GmbH
Postfach, Uetikon am See, CH-8707
Tel: +41-44-920 4904


Re: [Pound Mailing List] unexpected chunked EOF: Connection timed out
Ed Sawicki <ed(at)alcpress.com>
2007-07-04 19:36:02
Robert Segall wrote:
> On Thu, 2007-06-28 at 12:46 -0700, Ed Sawicki wrote:
>> I set up two back ends that serve identical content:
>>
>> 1. publicfile Web server - always returns chunked
>>    responses for any HTTP/1.1 request.
>>
>> 2. thttpd Web server - always returns a Content-Length:
>>    header for HTTP/1.1 requests for static content (does
>>    not use chunked encoding)
>>
>> When I use the publicfile back end, I frequently see the
>> above mentioned errors in the pound logs and the browser
>> user experiences delays.
>>
>> When I use the thttpd back end, there are no errors and
>> everything works as expected.
>>
>> Is there something more I can do to characterize the
>> problem?
>>
>> Ed
> 
> Thanks for your help Ed. 2.4b had indeed introduced a problem with
> chunked transfer-encoding. This should be fixed in the just-released
> 2.4c - give it a try and let me know.

It works!

I was using version 2.3.2, so it's been broken for a while
I guess.

I'm an author and I cover pound in my Guide to Apache book:

http://apachebook.alcpress.com

which is used in college courses. I'm expanding the Pound
section in the Second Edition so it's nice that the chunked
transfer encoding problem is fixed.

Thank you,
Ed Sawicki


Re: [Pound Mailing List] anti-scraping technology
Albert <pound(at)alacra.com>
2007-07-04 23:14:26


> I guess that depends on how you can identify the crawlers. I expect you
> already looked into the simple answers (firewall for source addresses,
> HeadDeny for header-based identification, traffic shaping for many quick
> requests from the same source address). Is there anything I missed?
>   
Yes, we have, and there is no simple answer.  We're planning to limit the
number of page views any class C range of IP addresses can have in a given
period of time.  But the problem is scrapers that come in through ISPs.  The
other issue is real crawlers: we don't want to limit the number of pages a
crawler sees, so in essence we need to create an "authorized" list of IP
addresses without any limitations.
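
A rough Python sketch of that idea: a sliding-window counter keyed by /24 network, with an allowlist that bypasses the limit. The class name, parameters and thresholds are assumptions made for illustration, it handles IPv4 only, and it keeps its state in memory in a single process.

```python
import ipaddress
import time
from collections import defaultdict, deque


class SubnetRateLimiter:
    """Limit page views per /24 network within a sliding time window."""

    def __init__(self, max_views, window_seconds, allowlist=()):
        self.max_views = max_views
        self.window = window_seconds
        # Networks on the "authorized" list are never limited.
        self.allow = [ipaddress.ip_network(net) for net in allowlist]
        # Maps each /24 network to the timestamps of its recent views.
        self.hits = defaultdict(deque)

    def allow_request(self, ip, now=None):
        """Return True if this page view may be served, False if over the limit."""
        now = time.monotonic() if now is None else now
        addr = ipaddress.ip_address(ip)
        if any(addr in net for net in self.allow):
            return True
        # strict=False masks the host bits, e.g. 10.0.0.7 -> 10.0.0.0/24.
        key = ipaddress.ip_network(f"{ip}/24", strict=False)
        recent = self.hits[key]
        # Drop views that have fallen out of the window.
        while recent and now - recent[0] >= self.window:
            recent.popleft()
        if len(recent) >= self.max_views:
            return False
        recent.append(now)
        return True
```

For example, `SubnetRateLimiter(max_views=300, window_seconds=3600, allowlist=["192.0.2.0/24"])` would cap each /24 at 300 views per hour while leaving the listed (here purely illustrative) network unrestricted. It does nothing about the slow, human-paced scrapers described below.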

Some scrapers tone down their bots to request a page every few
seconds (mimicking human-like speed).  Other scrapers seem to employ
cheap human labor, anywhere from India to China to Africa, so you have
somebody who looks like they are browsing your data but is really
scraping it.

One idea I had would be to allow pound to call a plugin.  But this would 
make pound more complicated, and slower, not to mention the plugin will 
need to be implemented. 

We'll be trying to tackle this problem in-house in the next few weeks, 
and I'll share our experiences.

Albert


Re: [Pound Mailing List] IP based blocking
James Evans <jevans(at)telesage.com>
2007-07-11 22:29:47
Would you mind sharing your ISAPI filter or giving a pointer in the right 
direction for writing one?

Thanks!

Albert wrote:
> The pound homepage describes how you can filter based on IP rules.
> 
> We ran into a similar issue here, and we wrote a small ISAPI filter for 
> IIS that restricts access (and in some cases expands access) based on 
> the IP address.
> 
> James Evans wrote:
>> Is there a way to make pound restrict access to certain directories 
>> based on IP rules? I used to restrict with IIS but now it sees all 
>> requests coming from the pound proxy.
>>
>> Thanks!
>>
>>
> 
> 


Re: [Pound Mailing List] mod_rewrite
Dave Steinberg <dave(at)redterror.net>
2007-07-28 16:12:53
Leonard Bethea wrote:
> I have tried several mod_rewrite conditions to change all http to https. It
> works well when using just the internal server IPs, but when trying to
> redirect back through pound it fails. My mod_rewrite script is:
> 
> RewriteEngine on
> RewriteCond %{HTTPS} on
> RewriteRule (.*) https://%{SERVER_ADDR}%{REQUEST_URI}
> 
> From my access logs it appears to continue to loop. What is the problem with
> this syntax for pound?

Since pound speaks HTTP only to the backend, all traffic appears to be 
HTTP traffic to Apache.  You'll want to use a header as your indicator. 
Something like this works:

===
RewriteEngine On

# request already arrived over SSL (per the header): leave it untouched
RewriteCond %{HTTP:X-Forwarded-Proto} =https
RewriteRule .* - [L]

# otherwise redirect to SSL
RewriteRule .* https://www.example.com%{REQUEST_URI} [R,L]
===

Be sure you're adding the 'X-Forwarded-Proto: https' header in pound.
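
If the same check has to live in application code rather than in mod_rewrite, the logic can be sketched in Python as follows. The function name and argument shapes are assumptions made for illustration, and unlike the exact `=https` match above, this version compares the header value case-insensitively.

```python
def https_redirect_target(headers, host, request_uri):
    """Return None when the request already came in over HTTPS
    (according to X-Forwarded-Proto), else the https URL to redirect to.
    """
    # Header names are case-insensitive, so normalise before the lookup.
    normalised = {name.lower(): value for name, value in headers.items()}
    if normalised.get("x-forwarded-proto", "").strip().lower() == "https":
        return None
    return f"https://{host}{request_uri}"
```

As with the rewrite rules, this only works if the proxy actually sets the header; without it every request looks like plain HTTP and the redirect loops.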

Regards,
-- 
Dave Steinberg
http://www.geekisp.com/
http://www.steinbergcomputing.com/

RE: [Pound Mailing List] mod_rewrite
"Leonard Bethea" <lbethea(at)aastest.com>
2007-07-29 00:27:00
Thanks Dave,

I was able to get the redirect to work. I tweaked your code a little and
added the header. If it's an SSL cert then everything works now. If the
header is not https then I redirect to the secure link; however, the
redirect does not pass the form information. I was wondering if this was
because of mod_rewrite or pound?


Re: [Pound Mailing List] mod_rewrite
Dave Steinberg <dave(at)redterror.net>
2007-07-29 00:48:38
Leonard Bethea wrote:
> Thanks Dave,
> 
> I was able to get the redirect to work. I tweaked your code a little and
> added the header. If it's an SSL cert then everything works now. If the
> header is not https then I redirect to the secure link; however, the
> redirect does not pass the form information. I was wondering if this was
> because of mod_rewrite or pound?

Neither - if your application is submitting a form to an HTTP URL and you 
expect to receive variables, the redirector is going to need to turn the 
POST variables into the query string of the redirect URL (thereby 
turning them into "GET variables").  In this case you're looking at a 
custom redirector rather than using mod_rewrite for it - at least if I'm 
understanding you correctly.
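
The core of such a redirector might look like the Python sketch below. The function name and the length limit are assumptions made for illustration; the limit is an arbitrary number, not a value taken from pound or Apache.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Arbitrary illustrative limit, not a value taken from pound or Apache.
MAX_URL_LENGTH = 2000


def redirect_url_with_form(target_url, form_fields, max_length=MAX_URL_LENGTH):
    """Append POSTed form fields to target_url as a query string."""
    parts = urlsplit(target_url)
    # Keep any query parameters the target URL already carries.
    pairs = parse_qsl(parts.query, keep_blank_values=True)
    pairs.extend(form_fields.items())
    url = urlunsplit(parts._replace(query=urlencode(pairs)))
    if len(url) > max_length:
        raise ValueError("form too large to carry in a redirect URL")
    return url
```

The redirector would send the returned URL in its Location: header, and the application behind the HTTPS listener would then read the fields as GET parameters.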

Naturally I'd recommend avoiding this if possible, as any large forms 
won't be too happy.

Regards,
-- 
Dave Steinberg
http://www.geekisp.com/
http://www.steinbergcomputing.com/
