pando.py

// gives homepage

Open chadwhitacre opened this issue 11 years ago • 21 comments

After running make doc, I would expect http://localhost:5370// to give me a 404, but instead it gives me the homepage.

chadwhitacre avatar Apr 02 '13 02:04 chadwhitacre

Some websites do that. I wouldn't say that this is a huge issue.

sigmavirus24 avatar Apr 02 '13 14:04 sigmavirus24

So, two conflicting takes on this:

  1. RFC 2396 says that the URI path separator is a single slash.
  2. The POSIX path definition says that "Multiple successive slashes are considered to be the same as one slash".

Since we stand at an intersection of the two, I vote that we go with whatever's implemented, which at the moment seems to be the second option.

pjz avatar Apr 17 '13 19:04 pjz
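
A minimal sketch of the two readings above, in plain Python. The helper names `uri_segments` and `collapse_slashes` are made up for illustration and are not part of Aspen or pando.py; the regex collapse only approximates the POSIX rule quoted above.

```python
import re


def uri_segments(path):
    """Split a URL path on single slashes, keeping empty segments (reading 1)."""
    parts = path.split("/")
    # Drop the empty string that precedes the leading slash, if any.
    return parts[1:] if path.startswith("/") else parts


def collapse_slashes(path):
    """Treat a run of slashes as one slash (reading 2, approximated with a regex)."""
    return re.sub(r"/{2,}", "/", path)


for p in ["/", "//", "/foo//bar"]:
    print(repr(p), uri_segments(p), repr(collapse_slashes(p)))
```

Under the first reading `//` carries one more (empty) segment than `/`; under the second the two paths are the same.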

Interesting. A twist: someone registered the empty string as their username on Gittip. I would expect this to show me their profile:

https://www.gittip.com//

chadwhitacre avatar Apr 18 '13 19:04 chadwhitacre

Treating double slashes as a single slash in URLs is very common practice. I've seen it in a bunch of places (mostly from bad code requesting URLs with repeated slashes and never getting fixed because it didn't break anything). It appears to be the default behavior of both Apache and Nginx and I'm pretty sure I've seen this behavior from IIS (I can't think of a quick way I can check that right now).

Playing with URLs in my existing browser tabs: Google, DuckDuckGo and others ignore the repeated slashes; GitHub gives a 404.

bruceadams avatar Jun 01 '13 11:06 bruceadams

Okay, so the issue I see with this is: what if there are files /.spt and /index.html.spt, and someone hits /? Which do they get? Is there an implied empty string after every /, and does it override the fallback paths? Is .spt a valid filename? (Note that it will be a 'hidden' file under Unix.) This seems orthogonal to the issue of // vs /, but I think it's related enough that if we answer it, it might give us a clue how to answer the // vs / problem.

pjz avatar Jun 04 '13 13:06 pjz

It appears that Flask also treats multiple successive slashes as a single slash, for what it's worth.

lyndsysimon avatar Jun 04 '13 13:06 lyndsysimon

I think this is fine as-is. If you really want to differentiate you can make a wildcard sptfile.

pjz avatar Jun 20 '13 15:06 pjz

Doesn't seem right to me. http://www.example.com// should be 404.

chadwhitacre avatar Jun 20 '13 15:06 chadwhitacre

I want https://www.gittip.com// to match %username with path['username'] set to ''.

chadwhitacre avatar Jun 20 '13 15:06 chadwhitacre

Because some schmoe changed their username on Gittip to the empty string, and I want to be like, "Sure! Go ahead!" :-)

chadwhitacre avatar Jun 20 '13 15:06 chadwhitacre
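
A rough sketch of what that wildcard behavior could look like if empty segments were preserved. `match_wildcard` and its pattern syntax are hypothetical and only loosely modeled on the `%username` naming above; this is not how Aspen's dispatcher is implemented.

```python
def match_wildcard(pattern, path):
    """Match a URL path against a pattern whose %name segments are wildcards.

    Returns a dict of captured values, or None when the path does not match.
    Empty segments are kept, so "//" has an empty segment between its slashes.
    """
    pattern_parts = pattern.split("/")
    path_parts = path.split("/")
    if len(pattern_parts) != len(path_parts):
        return None
    captured = {}
    for want, got in zip(pattern_parts, path_parts):
        if want.startswith("%"):
            # Wildcard segment: capture whatever is there, even "".
            captured[want[1:]] = got
        elif want != got:
            return None
    return captured


print(match_wildcard("/%username/", "//"))
print(match_wildcard("/%username/", "/alice/"))
print(match_wildcard("/%username/", "/"))
```

With this sketch `//` matches with `username` set to the empty string, while `/` does not match the wildcard at all and would fall through to the homepage.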

What if there are files /.spt and /index.html.spt, and someone hits /? Which do they get? Is there an implied empty string after every /, and does it override the fallback paths? Is .spt a valid filename? (Note that it will be a 'hidden' file under Unix.)

pjz avatar Jun 20 '13 15:06 pjz

If the only way to 'catch' that kind of filename is with a wildcard, I think we shouldn't do it.

pjz avatar Jun 20 '13 15:06 pjz

is there an implied empty string after every / ?

No, there's an actual empty string between every //. :-)

chadwhitacre avatar Jun 20 '13 16:06 chadwhitacre

Treating // as something different from / goes against de facto standards on the web. I'd love to find an RFC that speaks to this. (I have not yet found one.)

bruceadams avatar Jun 21 '13 04:06 bruceadams

Found this slightly-related reference while working on #195:

The "/" character may be used within HTTP to designate a hierarchical structure.

http://www.ietf.org/rfc/rfc1738.txt

chadwhitacre avatar Jun 21 '13 16:06 chadwhitacre

Also see "HIERARCHICAL FORMS" in http://www.ietf.org/rfc/rfc1630.txt:

      The slash ("/", ASCII 2F hex) character is reserved for the
      delimiting of substrings whose relationship is hierarchical.  This
      enables partial forms of the URI.  Substrings consisting of single
      or double dots ("." or "..") are similarly reserved.

      The significance of the slash between two segments is that the
      segment of the path to the left is more significant than the
      segment of the path to the right.  ("Significance" in this case
      refers solely to closeness to the root of the hierarchical
      structure and makes no value judgement!)

      Note

         The similarity to unix and other disk operating system filename
         conventions should be taken as purely coincidental, and should
         not be taken to indicate that URIs should be interpreted as
         file names.

chadwhitacre avatar Jun 21 '13 16:06 chadwhitacre

Sigh. Write me some failing tests in an issue170 branch and I'll see about making the dispatcher work correctly.

pjz avatar Jun 21 '13 17:06 pjz

...so if autoindex is on, should a request for // give a 404, or should it give the autoindex for / (autoindex('//') -> autoindex('/'))? There's no possible way to make it give anything else without a wildcard simplate (%foo.spt), as you can't make a directory with an empty-string name.

And what if there are files /.spt and /index.html.spt, and someone hits /? Which do they get? Logically I think /.spt would override index.html.spt, since the latter is a 'fallback' and the former is 'more particular'.

Is .spt a valid filename? (Note that it will be a 'hidden' file under Unix.)

Also, since Aspen mostly mimics the filesystem, I suspect people will be surprised when http://example.com/foo/bar != http://example.com/foo//bar

pjz avatar Aug 14 '15 14:08 pjz

@whit537 ping. Design opinions needed.

pjz avatar Oct 26 '15 18:10 pjz

Another example: https://gratipay.com/about//stats is currently 404.

chadwhitacre avatar Mar 08 '16 15:03 chadwhitacre

Discussing on https://github.com/AspenWeb/salon/issues/8 (at about 40 minutes?) ... let's redirect // to / in an algorithm function.

chadwhitacre avatar Mar 08 '16 15:03 chadwhitacre
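
One possible shape for that redirect, as a standalone sketch. `redirect_target` is a hypothetical helper, not an existing Aspen or pando.py algorithm function, and it assumes it is handed only the path portion of the URL (no query string).

```python
import re


def redirect_target(raw_path):
    """Return the slash-collapsed path to redirect to, or None if none is needed."""
    collapsed = re.sub(r"/{2,}", "/", raw_path)
    return collapsed if collapsed != raw_path else None


for p in ["//", "/about//stats", "/about/stats"]:
    print(repr(p), "->", redirect_target(p))
```

A caller would issue a redirect whenever the result is not `None` and dispatch as usual otherwise.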