cpython icon indicating copy to clipboard operation
cpython copied to clipboard

BaseHTTPRequestHandler.parse_request() loses client-provided information

Open tipabu opened this issue 3 years ago • 1 comments

Bug report

The fix for https://github.com/python/cpython/issues/87389 prevents servers from handling request paths with multiple leading slashes.

For example, one might have a simple server that just reflects the request path:

from http.server import *

class MyHTTPRequestHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        self.send_response(200)
        self.end_headers()
        self.wfile.write(self.path.encode('latin1'))

with ThreadingHTTPServer(('127.0.0.1', 8000), MyHTTPRequestHandler) as server:
    server.serve_forever()

Previously, this would faithfully mirror the request path from the client:

$ curl -v http://localhost:8000//test
*   Trying 127.0.0.1:8000...
* Connected to localhost (127.0.0.1) port 8000 (#0)
> GET //test HTTP/1.1
> Host: localhost:8000
> User-Agent: curl/7.82.0
> Accept: */*
> 
* Mark bundle as not supporting multiuse
* HTTP 1.0, assume close after body
< HTTP/1.0 200 OK
< Server: BaseHTTP/0.6 Python/3.8.13
< Date: Mon, 07 Nov 2022 21:51:19 GMT
< 
* Closing connection 0
//test

But now it mangles it:

$ curl -v http://localhost:8000//test
*   Trying 127.0.0.1:8000...
* Connected to localhost (127.0.0.1) port 8000 (#0)
> GET //test HTTP/1.1
> Host: localhost:8000
> User-Agent: curl/7.82.0
> Accept: */*
> 
* Mark bundle as not supporting multiuse
* HTTP 1.0, assume close after body
< HTTP/1.0 200 OK
< Server: BaseHTTP/0.6 Python/3.11.0
< Date: Mon, 07 Nov 2022 21:51:30 GMT
< 
* Closing connection 0
/test

This impacts any servers that subclass BaseHTTPRequestHandler, such as eventlet's WSGI server.

tipabu avatar Nov 07 '22 21:11 tipabu

FWIW, this also impacts wsgiref, which seems not-great considering how it's supposed to be a reference implementation. This example has the same bad behavior as above, except with a different Server header:

from wsgiref.simple_server import make_server

def app(env, start_response):
    start_response('200 OK', [])
    return [env['PATH_INFO'].encode('ascii')]

with make_server('', 8000, app) as httpd:
    httpd.serve_forever()

tipabu avatar Nov 08 '22 20:11 tipabu

The path mangling is not documented. The change in question seems to be commit defaa2b1. Perhaps it should have only been made in the SimpleHTTPRequestHandler class, not BaseHTTPRequestHandler.

vadmium avatar Apr 28 '24 05:04 vadmium