connect-go icon indicating copy to clipboard operation
connect-go copied to clipboard

HTTP middleware can't identify RPC paths

Open akshayjshah opened this issue 2 years ago • 1 comments

When using net/http middleware to wrap handlers and collect logs or metrics, it's nice to be able to parse out the protobuf package name, service name, and procedure name. However, it's potentially expensive to do this without an allowlist of known RPC endpoints: any script kiddie can generate zillions of invalid URLs, which may put a lot of load on observability systems (especially systems that treat each unique combination of tags as a timeseries).

@mattrobenolt brought this to our attention. Previously, he used grpc-go's generated ServiceDesc to build an allowlist. (This isn't what the gRPC folks want users to do with ServiceDesc, but that's beside the point.)

Today, users have a few options to solve this problem:

  • Emit logs & metrics from a connect interceptor. At that point, URLs have already been validated and Spec.Procedure is available. However, users may end up with one log line from HTTP middleware and another from a connect interceptor.
  • Use the generated service name constants to build a partial allowlist. This only validates the package.service portion of the URL, but that's likely enough to protect from most scanning attempts.
  • Write a separate plugin to generate an exact allowlist. This works, but is a bit of a pain.

If this is a common pain point, we could make this a bit easier by generating more code.

akshayjshah avatar Jul 18 '22 19:07 akshayjshah

How I worked around this, was using the Unimplemented*Handler struct and reflect. It's not possible, afaict, to reflect the actual interface generated, but fortunately, the Unimplemented struct also covers the entire interface, so we can leverage that.

package main

import (
	"fmt"
	"reflect"
)

type Thing interface {
	A()
	B()
}

type UnimplementedThing struct{}

func (UnimplementedThing) A() {}
func (UnimplementedThing) B() {}

var _ Thing = (*UnimplementedThing)(nil)

func extractMethods(t any) []string {
	t2 := reflect.TypeOf(t)
	methods := make([]string, 0, t2.NumMethod())
	for i := 0; i < t2.NumMethod(); i++ {
		methods = append(methods, t2.Method(i).Name)
	}
	return methods
}

func main() {
	fmt.Println(extractMethods((*UnimplementedThing)(nil)))
}

This is just modeled after the generated connect-go code, so this is relatively easy to extract the method names.

The only information this doesn't extract is what type of handlers these are. If they're streaming vs unary, which is sometimes useful and existed in the ServiceDesc from grpc-go. I can expand this to do more introspection and get the request/response types to determine what kinds they are.

mattrobenolt avatar Jul 19 '22 19:07 mattrobenolt

FWIW, this can also be accomplished using proto reflection instead of Go reflection.

This approach is a bit more general since it handles cases where the protoc plugin has to mangle the method name when generating Go code. It's the proto method name that is used in the URI path, not the name of the Go method. These can vary if the original source used a lower-case method name; in that case, the protoc plugin capitalizes it so that it is an exported symbol.

// import "google.golang.org/protobuf/reflect/protoreflect"
// import "google.golang.org/protobuf/reflect/protoregistry"

// can use generated service name constant here, e.g. userv1.UserServiceName
descriptor, err := protoregistry.GlobalFiles.FindDescriptorByName("acme.user.v1.UserService")
if err != nil { return err }
svcDesc := descriptor.(protoreflect.ServiceDescriptor)

methodNames := make([]string, svcDesc.Methods().Len())
for i := 0; i < svcDesc.Methods().Len(); i++ {
    methodNames[i] = string(svcDesc.Methods().Get(i).FullName())
}

jhump avatar Dec 16 '22 02:12 jhump

@jhump's got the right idea (@joshcarp has suggested a similar approach on other issues). I'm planning to document this approach and then close this issue.

akshayjshah avatar Dec 20 '22 07:12 akshayjshah