go icon indicating copy to clipboard operation
go copied to clipboard

net/http/httputil: ReverseProxy appends trailing slash to url with empty path

Open nkreiger opened this issue 3 years ago • 9 comments

What version of Go are you using (go version)?

$ go version
go version go1.18beta1 darwin/amd64

Does this issue reproduce with the latest release?

Yes

What operating system and processor architecture are you using (go env)?

go env Output
$ go env
GO111MODULE="on"
GOARCH="amd64"
GOBIN=""
GOCACHE="/Users/noahkreiger/Library/Caches/go-build"
GOENV="/Users/noahkreiger/Library/Application Support/go/env"
GOEXE=""
GOEXPERIMENT=""
GOFLAGS=""
GOHOSTARCH="amd64"
GOHOSTOS="darwin"
GOINSECURE=""
GOMODCACHE="/Users/noahkreiger/go/pkg/mod"
GONOPROXY="github.com/nkreiger"
GONOSUMDB="github.com/nkreiger"
GOOS="darwin"
GOPATH="/Users/noahkreiger/go"
GOPRIVATE="github.com/nkreiger"
GOPROXY="https://proxy.golang.org,direct"
GOROOT="/Users/noahkreiger/go/go1.18beta1"
GOSUMDB="sum.golang.org"
GOTMPDIR=""
GOTOOLDIR="/Users/noahkreiger/go/go1.18beta1/pkg/tool/darwin_amd64"
GOVCS=""
GOVERSION="go1.18beta1"
GCCGO="gccgo"
GOAMD64="v1"
AR="ar"
CC="clang"
CXX="clang++"
CGO_ENABLED="1"
GOMOD="/dev/null"
GOWORK=""
CGO_CFLAGS="-g -O2"
CGO_CPPFLAGS="-Wno-error -Wno-nullability-completeness -Wno-expansion-to-defined -Wbuiltin-requires-header"
CGO_CXXFLAGS="-g -O2"
CGO_FFLAGS="-g -O2"
CGO_LDFLAGS="-g -O2"
PKG_CONFIG="pkg-config"
GOGCCFLAGS="-fPIC -arch x86_64 -m64 -pthread -fno-caret-diagnostics -Qunused-arguments -fmessage-length=0 -fdebug-prefix-map=/var/folders/4w/ymct9nsd21j2tmf_8k7csf340000gn/T/go-build285603268=/tmp/go-build -gno-record-gcc-switches -fno-common"

What did you do?

Attempted to serve a reverse proxy where the base URL had a RawPath and Path value of "", an empty string.

// Serve a reverse proxy for a given url
func serveReverseProxy(target string, res http.ResponseWriter, req *http.Request) {
	// parse the url
	url, _ := url.Parse(target)

	// create the reverse proxy
	proxy := httputil.NewSingleHostReverseProxy(url)

	// Note that ServeHttp is non blocking and uses a go routine under the hood
	proxy.ServeHTTP(res, req)
}

target = "http://localhost:808/health"

req.URL.Path = "" req.URL.RawPath = ""

What did you expect to see?

I expected the new single host reverse proxy to rout the request to http://locahost:8080/health

What did you see instead?

It was routed to http://localhost:8080/health/ which causes the route to throw a 404 Not Found because of the extra suffix "/"

Potential Easy Fix

https://github.com/golang/go/blob/b357b05b70d2b8c4988ac2a27f2af176e7a09e1b/src/net/http/httputil/reverseproxy.go#L111

it this line to be

case !aslash && !bslash:
		if b == "" {
			return a
		}
		return a + "/" + b
	}

so if the base path is empty, it just returns the new path passed in instead of appending an extra slash...

nkreiger avatar Dec 24 '21 16:12 nkreiger

Change https://golang.org/cl/374276 mentions this issue: net/http/httputil: This change modifies Go to not add trailing slash on a direct reverse proxy Fixes #50337

gopherbot avatar Dec 24 '21 16:12 gopherbot

@neild per owners.

thanm avatar Dec 28 '21 16:12 thanm

Same here, I got 404 because of extra suffix "/"

septiajio avatar Jul 26 '22 07:07 septiajio

Is it valid per spec to have an empty url path though? I thought the resource being requested always has to be well-defined, so if you have

GET / HTTP/2

then you'd have path as /, with no way for a client to send a request with it being empty. See the following part of RFC 2616:

The most common form of Request-URI is that used to identify a
   resource on an origin server or gateway. In this case the absolute
   path of the URI MUST be transmitted (see [section 3.2.1](https://www.rfc-editor.org/rfc/rfc2616#section-3.2.1), abs_path) as
   the Request-URI, and the network location of the URI (authority) MUST
   be transmitted in a Host header field. For example, a client wishing
   to retrieve the resource above directly from the origin server would
   create a TCP connection to port 80 of the host "www.w3.org" and send
   the lines:

       GET /pub/WWW/TheProject.html HTTP/1.1
       Host: www.w3.org

   followed by the remainder of the Request. Note that the absolute path
   cannot be empty; if none is present in the original URI, it MUST be
   given as "/" (the server root).

krackers avatar Mar 20 '23 00:03 krackers

@krackers has a point.

Maybe modify reverseproxy.go:singleJoiningSlash as follows (comments added):

func singleJoiningSlash(a, b string) string {
	aslash := strings.HasSuffix(a, "/")
	bslash := strings.HasPrefix(b, "/")
	switch {
	case aslash && bslash:
		return a + b[1:] // avoid doubled slashes when both are present
	case !aslash && !bslash: // neither has a slash, make sure there is one, or at least the result is non-blank
		if b=="" {
		    if a=="" {
		        return "/" // if both are blank, return a slash
		    }
		    return a // avoid adding a trailing slash if b is blank
		}
		return a + "/" + b // both non-blank, separate with slash
	}
	return a + b // only one of a or b has a trailing/leading slash, just concatenate
}

The reverseproxy.go code is a bit strange. I've quoted it below and added //??? comments where I want to editorialize.

func joinURLPath(a, b *url.URL) (path, rawpath string) {
	if a.RawPath == "" && b.RawPath == "" { 
                //??? this is the only place singleJoiningSlash is called (other than in tests)
		return singleJoiningSlash(a.Path, b.Path), ""
 	}
	// Same as singleJoiningSlash, but uses EscapedPath to determine
	// whether a slash should be added
	apath := a.EscapedPath()
	bpath := b.EscapedPath()

	aslash := strings.HasSuffix(apath, "/")
	bslash := strings.HasPrefix(bpath, "/")

	switch {
	case aslash && bslash:
		return a.Path + b.Path[1:], apath + bpath[1:]
	case !aslash && !bslash:
                //??? do we have the same issue here? Does it matter?
		return a.Path + "/" + b.Path, apath + "/" + bpath
	}
	return a.Path + b.Path, apath + bpath
}

tmalaher-telus avatar May 18 '23 21:05 tmalaher-telus

I was actually meaning that since it's not valid to have a GET request with empty resource being requested, we shouldn't need to handle this case in httputil itself.

As I understand, the api of reverseproxy is really meant to route paths under a directory-like structure. So if you reverse proxy to http://locahost:8080 and request /health then the request made is for http://locahost:8080/health. Likewise if you route to http://locahost:8080/health and request /foo then the request is for http://locahost:8080/health/foo.

Now OP's situation is that if you route to http://locahost:8080/health and then request /, then the request is for http://locahost:8080/health/ which may not be defined. However think this case should be handled with a custom handler func in your code, rather than trying to hack an empty-path to signal that a trailing slash should not be appended (since an empty-path is invalid per spec).

krackers avatar May 20 '23 21:05 krackers

Hi,

This is my solution:

`func serveReverseProxy(target string, res http.ResponseWriter, req *http.Request) {

url, _ := url.Parse(target)

req.URL = url    // <----------  add this
url.Path = ""   // <----------  add this

proxy := httputil.NewSingleHostReverseProxy(url)

proxy.ServeHTTP(res, req)

}`

nicolasopt avatar Oct 13 '23 09:10 nicolasopt

As @krackers says: An HTTP request URL can't contain an empty path. Should we be adding a special case to handle one, given that our existing behavior doesn't seem particularly wrong?

neild avatar Apr 19 '24 00:04 neild

Change https://go.dev/cl/595695 mentions this issue: net/http/httputil: fix joinURLPath unexpectedly appends a trailing slash

gopherbot avatar Jun 28 '24 10:06 gopherbot

Change https://go.dev/cl/658536 mentions this issue: net/http/httputil: document ProxyRequest.SetURL limitations

gopherbot avatar Mar 17 '25 19:03 gopherbot