grpc-go
gRPC-Go should drop received headers containing upper ASCII (0x80+) characters
I have a question about a new version: v1.46.0, released on Apr 22. The release adds client-side validation of HTTP-invalid metadata before attempting to send (PR #4886).
One of the validation rules states that a gRPC request header value must consist of one or more characters from the set [%x20-%x7E] (ref code), effectively excluding non-printable and non-ASCII characters.
As a result, in some cases, bumping the package from v1.45 to v1.46 may be a breaking change whenever header values carry names and localized content (e.g. Łukasz, Renée, Noël).
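For illustration, here is a minimal sketch of that rule, assuming a hypothetical helper `validASCIIValue` (this mirrors the check, it is not grpc-go's actual code):

```go
package main

import "fmt"

// validASCIIValue reports whether v satisfies the new validation rule:
// every byte must fall in the range %x20-%x7E (space and printable ASCII).
// Hypothetical helper for illustration, not grpc-go's implementation.
func validASCIIValue(v string) bool {
	for i := 0; i < len(v); i++ {
		if v[i] < 0x20 || v[i] > 0x7E {
			return false
		}
	}
	return true
}

func main() {
	fmt.Println(validASCIIValue("Lukasz")) // true: printable ASCII only
	fmt.Println(validASCIIValue("Łukasz")) // false: Ł encodes to bytes >= 0x80 in UTF-8
}
```

Note that the check is per byte, so any multi-byte UTF-8 character fails it.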
Was that an oversight or fully intentional to comply with RFC7230?
The change was intended to bring us in line with the HTTP spec, but I think there is a bug in the enforcement. RFC7230 says:
quoted-string = DQUOTE *( qdtext / quoted-pair ) DQUOTE
qdtext = HTAB / SP / %x21 / %x23-5B / %x5D-7E / obs-text
obs-text = %x80-FF
%x80-FF seems to be there to allow for Unicode, but we are rejecting it.
@dfawley Sounds reasonable to fix, I think. Is there anything I can help with to have that fixed?
I did a little more digging into this. It looks like the gRPC spec itself forbids this:
https://github.com/grpc/grpc/blob/master/doc/PROTOCOL-HTTP2.md#requests
Custom-Metadata → Binary-Header / ASCII-Header
ASCII-Header → Header-Name ASCII-Value
ASCII-Value → 1*( %x20-%x7E ) ; space and printable ASCII
Also, gRPC-Java for sure will have trouble with these values (I'm not sure about C++/other implementations). The only reason it worked for go->go servers and clients is that both implementations assumed UTF-8, which would not be the case in other languages.
For this reason, I believe it was a bug that we allowed these values before, and that the validation, as added, is mostly working as intended. I do apologize for the breakage, though; it was unexpected because we did not realize that the underlying HTTP framer implementation was accepting Unicode values.
Instead of transmitting the strings directly, it is recommended to percent-encode them first (and decode them on the receiving side).
However, we do still have a bug. The spec goes on to say:
The ASCII-Value character range defined is more strict than HTTP. Implementations must not error due to receiving an invalid ASCII-Value that's a valid field-value in HTTP, but the precise behavior is not strictly defined: they may throw the value away or accept the value. If accepted, care must be taken to make sure that the application is permitted to echo the value back as metadata. For example, if the metadata is provided to the application as a list in a request, the application should not trigger an error by providing that same list as the metadata in the response.
This means that on the receiving side, we also need to drop these headers, as we have no way to distinguish between metadata received over the wire and metadata created by the application.
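Sketching what dropping on the receiving side could look like (a plain `map[string][]string` stands in for gRPC's `metadata.MD` here; this is illustrative, not grpc-go's actual implementation):

```go
package main

import "fmt"

// isPrintableASCII reports whether every byte of v is in %x20-%x7E.
func isPrintableASCII(v string) bool {
	for i := 0; i < len(v); i++ {
		if v[i] < 0x20 || v[i] > 0x7E {
			return false
		}
	}
	return true
}

// dropInvalidASCII removes values outside %x20-%x7E, matching the
// "may throw the value away" option the spec permits for receivers.
func dropInvalidASCII(md map[string][]string) map[string][]string {
	out := make(map[string][]string, len(md))
	for k, vals := range md {
		var kept []string
		for _, v := range vals {
			if isPrintableASCII(v) {
				kept = append(kept, v)
			}
		}
		if len(kept) > 0 {
			out[k] = kept
		}
	}
	return out
}

func main() {
	md := map[string][]string{
		"x-user":  {"Łukasz"}, // dropped: contains bytes >= 0x80
		"x-trace": {"abc123"}, // kept: printable ASCII
	}
	fmt.Println(dropInvalidASCII(md)) // map[x-trace:[abc123]]
}
```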
Any update on this?
Yeah, I just got burned by this too. Whether the original behavior was a bug or not, introducing a breaking change like this is not a small thing.
(If we're going to start silently dropping the headers on the server-side too, I'm concerned that will also lead to more confusion.)
I'm sorry. Although I implemented this validation, I'm getting burned by it too now. We didn't anticipate that Unicode was so widely used in header values. Should we really disallow %x80-FF? The HTTP RFC7230 allows it, and many users pass Unicode values. Maybe we need to modify the gRPC spec? And I agree with @jonathanrhodes: if we disallow %x80-FF and drop invalid headers on the receiving side, this covert breaking change is confusing.
I have the same problem and don't have a better solution.
Any solution found besides downgrading?
It should be possible to use metadata key names ending with -bin; gRPC will base64-encode/decode their values on the wire.