PEP 3333: URL-decoding and routing
There have been several discussions over the years about how WSGI frameworks apply routing in the case of URL-encoded path components. Here's a comment that links to a few of the discussions:
https://github.com/encode/starlette/pull/1828#issuecomment-1434043248
The issue is that when using routing features, the URL /user/foo/edit appears to be indistinguishable from /user/foo%2Fedit, and in case of routing rules for /user/{username} and /user/{username}/edit, the URL /user/foo%2Fedit will match the second rule and not the first.
This seems to me to be in conflict with RFC 3986 section 2.4 and RFC 3986 section 2.2. The latter states:
URIs that differ in the replacement of a reserved character with its corresponding percent-encoded octet are not equivalent.
But the routing mechanism in popular Python web frameworks such as Flask and FastAPI is unable to differentiate the mentioned URLs, because the URI's percent-encoding is decoded before the routing happens.
I believe PEP-3333 needs to clarify the correct behaviour.
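To make the behaviour above concrete, here is a minimal sketch in Python. It is not how Flask, FastAPI, or any particular WSGI server implements routing; the ROUTES table, the regular expressions standing in for {username}, and the helper names are all hypothetical, chosen only to show what happens when percent-decoding runs before or after matching.

```python
import re
from urllib.parse import unquote

# Hypothetical routing table, checked in order. "{username}" is modelled
# as "one path segment containing no literal slash".
ROUTES = [
    ("user_detail", re.compile(r"^/user/(?P<username>[^/]+)$")),
    ("user_edit", re.compile(r"^/user/(?P<username>[^/]+)/edit$")),
]


def match(path):
    """Return (rule_name, username) for the first matching rule, or None."""
    for name, pattern in ROUTES:
        m = pattern.match(path)
        if m:
            return name, m.group("username")
    return None


def route_decode_first(raw_path):
    """Decode the whole path, then route (the behaviour described above)."""
    return match(unquote(raw_path))


def route_then_decode(raw_path):
    """Route on the raw path, then decode only the captured parameter."""
    result = match(raw_path)
    if result is None:
        return None
    name, username = result
    return name, unquote(username)


for path in ("/user/foo/edit", "/user/foo%2Fedit"):
    print(path, route_decode_first(path), route_then_decode(path))
```

With decoding first, both URLs reach the router as /user/foo/edit and hit the edit rule. Routing on the raw path keeps %2F inside the username segment, so /user/foo%2Fedit matches /user/{username} with the username foo/edit.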
https://github.com/python/peps/issues/3280#issue-1840658032
This issue was discussed in detail in 2008:
[Web-SIG] WSGI Amendments thoughts: the horror of charsets https://www.mail-archive.com/[email protected]/msg02483.html
It's a complex situation, and I think there is no straightforward answer.
I think it likely that the only way that the WSGI spec will change in relation to this issue is if there is a specific change proposed.