bottle Router unable to match wildcard filter in the middle of a URL

I'm trying to use the :path wildcard filter (as described here) to match part of my URL, which includes a forward slash character. For example, if I have the URL:

/resources/adfs89s7/container/asdf%2Fasdf/items

(where %2F is the forward slash), I want to match it to the route:

/resources/<resource_id>/container/<container_name:path>/items

However, this is currently returning a Not Found error. I have similar URLs where the wildcard filter is at the end of the URL, e.g.

/resources/<resource_id>/container/<container_name:path>

and that seems to work fine.

Mar 04 '15 03:03 claire-lee

The two strings %2F and / are equivalent in a URI path. You can encode any character this way; a equals %61, for example.

Jun 14 '15 17:06 defnull

@claire-lee What you are trying is probably better suited for regexes, because, afaik, :path wildcard will consume the rest of the path including escaped and unescaped slashes.

Jun 15 '15 12:06 foxbunny

Oh, the :path filter works just fine. I cannot reproduce the error.

>>> import bottle
>>> app = bottle.Bottle()
>>> app.route('/resources/<resource_id>/container/<container_name:path>/items', callback=True)
>>> app.match(dict(PATH_INFO='/resources/adfs89s7/container/asdf%2Fasdf/items', REQUEST_METHOD='GET'))
(..., {'resource_id': 'adfs89s7', 'container_name': 'asdf%2Fasdf'})
>>> app.match(dict(PATH_INFO='/resources/adfs89s7/container/asdf/asdf/items', REQUEST_METHOD='GET'))
(..., {'resource_id': 'adfs89s7', 'container_name': 'asdf/asdf'})

Jun 15 '15 12:06 defnull

The following will 404 when using %2F, but works fine with a literal forward slash:

http://something.com/test/123/asd -> 123/asd
http://something.com/test/123%2Fasd -> 404

@route("/test/<test:re:.+>", method='GET')
def test(test):
    return test

May 19 '16 03:05 tresni

I still cannot reproduce this bug in master or release-v12:

>>> import bottle
>>> app = bottle.Bottle()
>>> app.route('/test/<test:re:.+>', callback=True)
True
>>> app.match(dict(PATH_INFO='/test/123/asd', REQUEST_METHOD='GET'))
(<GET '/test/<test:re:.+>' True>, {'test': '123/asd'})
>>> app.match(dict(PATH_INFO='/test/123%2Fasd', REQUEST_METHOD='GET'))
(<GET '/test/<test:re:.+>' True>, {'test': '123%2Fasd'})

I tried the exact script you posted (plus import statements and a run() at the end). It works as intended:

$ curl http://127.0.0.1:8080/test/123/asd
123/asd
$ curl http://127.0.0.1:8080/test/123%2Fasd
123/asd

😕

May 19 '16 08:05 defnull

I'm facing the same issue. To explain: URI-encoding the slash should mean that it is not interpreted as a hierarchy delimiter, in contrast to a literal forward slash. The point of encoding is to remove the semantic meaning. That is, I would expect that:

/test/123%2Fasd refers to the document 123%2Fasd located at test. As such, it should not match the route /test/123/:doc.
/test/123/asd refers to the document asd located at /test/123. As such, it should match the route /test/123/:doc.

However, given this service:

def print_all(*args, **kwargs): print(*args, kwargs)

import bottle
app = bottle.Bottle()
app.get("/test/123/:doc")(print_all)
app.run()

curl http://localhost:8080/test/123%2Fasd
# Server log: {'doc': 'asd'}, should return a 404
curl http://localhost:8080/test/123/asd
# Server log: {'doc': 'asd'}

It seems that the URI encoding is already removed prior to the call to match, hence both requests are matched as /test/123/asd. This would explain all observations:

match and route, when called with still-encoded route information, work as intended and do not match the still encoded /%2F.
A running app will never really call those methods with the anticipated parameters.
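
The decoding step can be illustrated with a minimal sketch that uses only the standard library. It assumes, as suggested above, that the server percent-decodes the path before the router sees it. It does not call bottle itself, and the variable names are only illustrative.

```python
from urllib.parse import unquote

# The two request paths from the curl calls above.
encoded = "/test/123%2Fasd"
literal = "/test/123/asd"

# Assumption: the path is percent-decoded as a whole before routing.
# After that step the encoded slash can no longer be told apart
# from a literal one.
decoded = unquote(encoded)

print(decoded)             # /test/123/asd
print(decoded == literal)  # True
```

If this is what happens, the router receives identical input for both requests, which would be consistent with both hitting the same handler.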

How this can be resolved is a difficult question. It certainly makes some sense to do URL decoding after having isolated the pure path component, as this makes it far more ergonomic to match paths with special characters such as spaces or question marks. And indeed, matching /test/doc?cheeky would match a document named doc?cheeky (encoded as /test/doc%3Fcheeky) and not a GET with a query. The handler should most definitely get the URI-decoded path components as well. However, this scheme means that / is always semantically interpreted even if it shouldn't be, and there is no way for the client to escape it.
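
One possible direction, sketched under the assumption that the still-encoded request path is available at routing time, is to split on literal slashes first and decode each segment only afterwards. The helper split_then_decode is hypothetical and not part of bottle.

```python
from urllib.parse import unquote


def split_then_decode(raw_path):
    """Split a still-encoded path on literal slashes, then decode each segment.

    Hypothetical helper for illustration; not a bottle API.
    """
    segments = raw_path.strip("/").split("/")
    return [unquote(segment) for segment in segments]


# An encoded slash stays inside its segment ...
print(split_then_decode("/test/123%2Fasd"))  # ['test', '123/asd']
# ... while a literal slash separates segments.
print(split_then_decode("/test/123/asd"))    # ['test', '123', 'asd']
```

With this ordering a handler would still receive decoded values, but the two paths would no longer collapse into the same sequence of segments.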

Aug 12 '20 15:08 HeroicKatora
