bad URL
"Claus Rosenberger" <Claus.Rosenberger(at)rocnet.de>
2003-11-25 17:10:32
hi,

i get the following error while using pound together with zope, but i
think, it's not a zope issue:

Nov 25 16:44:29 [pound] bad URL
"/rocnet/messages/manage_messages?msg=bmFwcHJfaGVhZGVy-0X1.EF99C08052EE1P-12&batch_start:int=0&batch_size:int=15&regex=&empty=&l

my pound.conf

UrlGroup        ".*"
HeadRequire     Host    ".*rocnet.de.*"
BackEnd         127.0.0.1,9080,1
EndGroup

All other requests are fine and pound works, but i have problems with these urls.

any ideas?

best regards

claus

Re: bad URL
Robert Segall <roseg(at)apsis.ch>
2003-11-25 18:15:16
On Tuesday 25 November 2003 17:10, Claus Rosenberger wrote:[...]

Use -current rather than 1.5, so that (by default) the URL checking is 
disabled.

Alternately add the character ':' to CSqid.[...]
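
To illustrate the idea of a per-component character whitelist, here is a minimal Python sketch. It is not Pound's code: the function name `check_query` and both character sets are assumptions made up for this example. Only the effect mirrors the advice above: a query id containing ':' is rejected until ':' is added to the set allowed for ids.

```python
import string

# Hypothetical allowed sets for illustration, NOT Pound's real defaults.
QID_CHARS = set(string.ascii_letters + string.digits + "_-.")
QVAL_CHARS = set(string.ascii_letters + string.digits + "_-.")


def check_query(query, qid_chars=QID_CHARS, qval_chars=QVAL_CHARS):
    """Return True if every id and value uses only allowed characters."""
    for pair in query.split("&"):
        qid, _, qval = pair.partition("=")
        if not set(qid) <= qid_chars or not set(qval) <= qval_chars:
            return False
    return True


query = "batch_start:int=0&batch_size:int=15"
strict = check_query(query)                      # ':' not allowed in ids
relaxed = check_query(query, QID_CHARS | {":"})  # ':' added to the id set
print(strict, relaxed)  # prints "False True"
```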

Re: bad URL
"Claus Rosenberger" <Claus.Rosenberger(at)rocnet.de>
2003-11-25 19:03:16
> On Tuesday 25 November 2003 17:10, Claus Rosenberger wrote:[...][...]

thx, this helps. why is ':' not allowed by default? is it an error, or is
there a special reason why it's not a default value?

Re: bad URL
Robert Segall <roseg(at)apsis.ch>
2003-11-25 22:38:27
On Tuesday 25 November 2003 19:03, Claus Rosenberger wrote:[...]

There are some reasons, and they are called RFCs. They define what characters
are allowed/disallowed in URLs, and ':' is not one of the allowed ones.

Some servers/applications use things beyond the standards - so Pound allows
you to change its treatment of URLs to accommodate them.[...]

Re: bad URL
"Claus Rosenberger" <Claus.Rosenberger(at)rocnet.de>
2003-11-26 10:56:12
>[...]


hi,

i searched the RFCs and found the following in RFC 2396:

3.4. Query Component

   The query component is a string of information to be interpreted by
   the resource.

      query         = *uric

   Within a query component, the characters ";", "/", "?", ":", "@",
   "&", "=", "+", ",", and "$" are reserved.

The ":" should be allowed in a query string i think.
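
The grammar quoted above can be checked mechanically. The following is a minimal Python sketch, assuming the RFC 2396 definitions `uric = reserved | unreserved | escaped` with the reserved and mark characters from sections 2.2 and 2.3; the function name `is_rfc2396_query` is made up for this example. It only tests whether a string matches `query = *uric`; it says nothing about whether a proxy should accept reserved characters as data.

```python
import re

# Character classes from RFC 2396, sections 2.2 to 2.4.
RESERVED = ";/?:@&=+$,"
MARK = "-_.!~*'()"
URIC = re.compile(
    r"(?:[A-Za-z0-9" + re.escape(RESERVED + MARK) + r"]|%[0-9A-Fa-f]{2})*"
)


def is_rfc2396_query(query):
    """Return True if the whole string matches query = *uric."""
    return URIC.fullmatch(query) is not None


print(is_rfc2396_query("batch_start:int=0&batch_size:int=15"))  # prints "True"
print(is_rfc2396_query("a b"))                                  # prints "False"
```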

Claus

Re: bad URL
Robert Segall <roseg(at)apsis.ch>
2003-11-26 15:00:52
On Wednesday 26 November 2003 10:56, Claus Rosenberger wrote:[...]

The way I read it, "reserved" means "by default not allowed", and that is the 
Pound behaviour. You are always free to add them for your installation - for 
our part we try to deliver a secure default installation.

I would be interested in the group's opinion if this should be relaxed - and
if yes, to what extent.[...]

Re: bad URL
Todd Freeman <freeman(at)andrews.edu>
2003-11-26 15:26:51
From what I remember these characters are "reserved" as in they should
not be used for data in query strings. For example, many browsers
rewrite any space (' ') character as '+': the key "something" with the
value 'foo and or bar', requested from index.cgi, has a GET string of
"index.cgi?something=foo+and+or+bar".

'&' and ';' are both now used as key=value delimiters, i.e.:
index.cgi?foo=bar&bar=foo&something=2
or
index.cgi?foo=bar;bar=foo;something=2

are both valid get strings to pass those 3 keys and values.
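
The mechanics described here can be sketched in Python. This is only an illustration under assumptions: `split_pairs` is a hypothetical helper written for this example, while `urlencode` and `unquote_plus` are the standard `urllib.parse` functions. Whether a given server accepts ';' as a delimiter depends on the server.

```python
from urllib.parse import urlencode, unquote_plus


def split_pairs(query, sep):
    """Split a query string into (key, value) pairs on one delimiter."""
    pairs = []
    for item in query.split(sep):
        key, _, value = item.partition("=")
        # '+' in a form-encoded query stands for a space.
        pairs.append((unquote_plus(key), unquote_plus(value)))
    return pairs


encoded = urlencode({"something": "foo and or bar"})
with_amp = split_pairs("foo=bar&bar=foo&something=2", "&")
with_semi = split_pairs("foo=bar;bar=foo;something=2", ";")

print(encoded)                # prints "something=foo+and+or+bar"
print(with_amp == with_semi)  # prints "True"
```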

The '?' character should only appear once, between the uri and the
key=value elements.

The ':' is reserved because it should only appear once between the
service type ('http','https','gopher' :} ) and the uri. This has fallen
somewhat out of being religiously followed though since modern browsers
and servers have MUCH better pattern matching and can handle the service
and URI splitting with ease.

The '@' was reserved because they wanted it to appear only in the email
addresses in 'mailto:' methods. Kind of silly but it did make parsing
easier.

The '/' character was reserved originally to designate directory
separations... but that was in the day of 100% static pages... Most
people use the '/' in other places as well now... This is one place that
the RFC should be updated.

The ',' character was originally also reserved as a separator like '&'
and ';' however I have never observed an implementation that worked with
that. I believe that is another area that the RFC is lacking an update.

I am rattling this from the top of my head... if someone has corrections
or further enlightenment I would love to hear it.

I know that we had to turn off the URL checking at our site because
developers had made VAST use of the '+' character to denote spaces...
A couple others were pains as well...

Just my 1.5 yen

On Wed, Nov 26, 2003 at 03:00:52PM +0100, Robert Segall wrote:[...]
[...]

Re: bad URL
Robert Segall <roseg(at)apsis.ch>
2003-11-26 16:01:23
On Wednesday 26 November 2003 15:13, Claus Rosenberger wrote:[...]

They are used for other things, such as '@' being a separator between user
credentials and server, '?' and '&' separators between query elements, '='
separator between query id and value, '#' and ';' separators before
parameter/fragment and so on.

For example, if you define '&' to be allowed in your qval you can only have
one query parameter in your request. If you define '#' to be allowed you
cannot have fragments (anchors) in your requests. If you define '?' in
segment you can never have any parameters.
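
The '&' case can be made concrete with a small Python sketch. It is an illustration only, not Pound's parser: `parse_query` and its `amp_is_data` flag are hypothetical names invented here to model "'&' is allowed as value data".

```python
def parse_query(query, amp_is_data=False):
    """Parse 'id=value' pairs; optionally treat '&' as ordinary value data."""
    if amp_is_data:
        # '&' no longer separates anything, so the whole tail is one value.
        qid, _, qval = query.partition("=")
        return [(qid, qval)]
    return [tuple(pair.partition("=")[::2]) for pair in query.split("&")]


query = "x=1&y=2"
as_separator = parse_query(query)
as_data = parse_query(query, amp_is_data=True)

print(as_separator)  # prints "[('x', '1'), ('y', '2')]"
print(as_data)       # prints "[('x', '1&y=2')]"
```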
[...]

Please keep this on the list.[...]

Re: bad URL
Robert Segall <roseg(at)apsis.ch>
2003-11-26 16:18:26
On Wednesday 26 November 2003 15:26, Todd Freeman wrote:[...]

Correct
[...]

Not quite: ';' separates what the RFCs call 'parameter' from the 'segment'. 
Thus 'path_to_file;last_revision?x=y' is a valid request, 'x.cgi?x=y;a=b' is 
NOT. This is probably a VMS legacy...
[...]

Correct
[...]

Almost true (the ':' may also appear as a separator between user and password
- https://joe:big_secret@www.server.com/x.html is a valid URL). Still, we
stick with the RFC.
[...]

Also on HTTP (and all other protocols in the family):
'http://joe@www.server.com/x' is a valid URL meaning 'get through HTTP
file/resource x on server www.server.com as user joe'.
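
For reference, Python's standard `urllib.parse.urlparse` splits a URL along the separators discussed in this message. The URL below is an assumption: it combines the user:password and ';' parameter examples from this message into one made-up address, with an invented fragment `top`.

```python
from urllib.parse import urlparse

# Made-up URL combining the examples above; '#top' is invented.
url = "https://joe:big_secret@www.server.com/path_to_file;last_revision?x=y#top"
parts = urlparse(url)

print(parts.username, parts.password)  # prints "joe big_secret"
print(parts.hostname)                  # prints "www.server.com"
print(parts.path)                      # prints "/path_to_file"
print(parts.params)                    # prints "last_revision"
print(parts.query, parts.fragment)     # prints "x=y top"
```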
[...]

Possibly - but the fact that some people choose to ignore the RFC does not 
make it wrong. The fact that some companies ignore (euphemistically: 
'extend') the RFCs is not to be encouraged. Next thing you know your 
favourite browser does not work on a lot of sites...
[...]

Probably.
[...]

We have added '+' to the qval allowed values in -current.
[...]

That makes about 1.7 (US) cents? I know, nitpicking...[...]

Re: bad URL
Felix Buenemann <atmosfear(at)users.sourceforge.net>
2003-11-26 17:27:00
On Wednesday 26 November 2003 15:00, Robert Segall wrote:[...]

Hmm, I don't see much sense in enforcing this just for the sake of the RFC; as
we can see, most (all?) modern webservers don't care about it.
Even if you have some old-timer webserver, it would surely give its own error
message for requests it doesn't like, so rejecting them in Pound is IMHO a bad
idea. But as the code is already there, you could just as well make it
optional and disabled by default.
[...]

Re: bad URL
"Simon Matter" <simon.matter(at)ch.sauter-bc.com>
2003-11-26 23:28:36
> On Wednesday 26 November 2003 15:00, Robert Segall wrote:[...][...]

I don't agree with you. We're using pound mostly as part of our security
infrastructure. In this environment it makes perfect sense to restrict as
much as possible in a way that usually doesn't hurt anybody. We had a
number of webapps which didn't work correctly and all the problems have
been fixed by the app developer once I told them what our problem was with
it. From a security point of view it's good to enforce the RFC rules.

Simon
