grpc-go
gRPC-Go should drop received headers containing upper ASCII (0x80+) characters
I have a question about a new version: v1.46.0, released on Apr 22. The release adds client-side validation of HTTP-invalid metadata before attempting to send (PR #4886).
One of the validation rules states that a gRPC request header value must consist of one or more characters from the set [%x20-%x7E] (ref code), effectively excluding non-printable and non-ASCII characters.
As a result, in some cases, bumping the package from v1.45 to v1.46 may be a breaking change whenever header values carry names and localized content (e.g. Łukasz, Renée, Noël).
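For illustration, here is a minimal sketch of that rule, assuming a hypothetical helper `validASCIIValue` (this mirrors the check, it is not grpc-go's actual code):

```go
package main

import "fmt"

// validASCIIValue reports whether v satisfies the new validation rule:
// every byte must fall in the range %x20-%x7E (space and printable ASCII).
// Hypothetical helper for illustration, not grpc-go's implementation.
func validASCIIValue(v string) bool {
	for i := 0; i < len(v); i++ {
		if v[i] < 0x20 || v[i] > 0x7E {
			return false
		}
	}
	return true
}

func main() {
	fmt.Println(validASCIIValue("Lukasz")) // true: printable ASCII only
	fmt.Println(validASCIIValue("Łukasz")) // false: Ł encodes to bytes >= 0x80 in UTF-8
}
```

Note that the check is per byte, so any multi-byte UTF-8 character fails it.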
Was that an oversight or fully intentional to comply with RFC7230?
The change was intended to bring us in line with the HTTP spec, but I think there is a bug in the enforcement. RFC7230 says:
quoted-string = DQUOTE *( qdtext / quoted-pair ) DQUOTE
qdtext = HTAB / SP / %x21 / %x23-5B / %x5D-7E / obs-text
obs-text = %x80-FF
%x80-FF seems to be there to allow for Unicode, but we are rejecting it.
@dfawley Sounds reasonable to fix, I think. Is there anything I can help with to have that fixed?
I did a little more digging into this. It looks like the gRPC spec itself forbids this:
https://github.com/grpc/grpc/blob/master/doc/PROTOCOL-HTTP2.md#requests
Custom-Metadata → Binary-Header / ASCII-Header
ASCII-Header → Header-Name ASCII-Value
ASCII-Value → 1*( %x20-%x7E ) ; space and printable ASCII
Also, gRPC-Java for sure will have trouble with these values (I'm not sure about C++/other implementations). The only reason it worked for go->go servers and clients is that both implementations assumed UTF-8, which would not be the case in other languages.
For this reason, I believe it was a bug that we allowed these values before, and that the validation, as added, is mostly working as intended. I do apologize for the breakage, though; it was unexpected because we did not realize that the underlying HTTP framer implementation was accepting Unicode values.
Instead of transmitting the strings directly, it is recommended to percent-encode them first (and decode them on the receiving side).
However, we do still have a bug. The spec goes on to say:
The ASCII-Value character range defined is more strict than HTTP. Implementations must not error due to receiving an invalid ASCII-Value that's a valid field-value in HTTP, but the precise behavior is not strictly defined: they may throw the value away or accept the value. If accepted, care must be taken to make sure that the application is permitted to echo the value back as metadata. For example, if the metadata is provided to the application as a list in a request, the application should not trigger an error by providing that same list as the metadata in the response.
This means that on the receiving side, we also need to drop these headers, as we have no way to distinguish between metadata received over the wire and metadata created by the application.
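Sketching what dropping on the receiving side could look like (a plain `map[string][]string` stands in for gRPC's `metadata.MD` here; this is illustrative, not grpc-go's actual implementation):

```go
package main

import "fmt"

// isPrintableASCII reports whether every byte of v is in %x20-%x7E.
func isPrintableASCII(v string) bool {
	for i := 0; i < len(v); i++ {
		if v[i] < 0x20 || v[i] > 0x7E {
			return false
		}
	}
	return true
}

// dropInvalidASCII removes values outside %x20-%x7E, matching the
// "may throw the value away" option the spec permits for receivers.
func dropInvalidASCII(md map[string][]string) map[string][]string {
	out := make(map[string][]string, len(md))
	for k, vals := range md {
		var kept []string
		for _, v := range vals {
			if isPrintableASCII(v) {
				kept = append(kept, v)
			}
		}
		if len(kept) > 0 {
			out[k] = kept
		}
	}
	return out
}

func main() {
	md := map[string][]string{
		"x-user":  {"Łukasz"}, // dropped: contains bytes >= 0x80
		"x-trace": {"abc123"}, // kept: printable ASCII
	}
	fmt.Println(dropInvalidASCII(md)) // map[x-trace:[abc123]]
}
```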
Any update on this?
Yeah, I just got burned by this too. Whether the original behavior was a bug or not, introducing a breaking change like this is not a small thing.
(If we're going to start silently dropping the headers on the server-side too, I'm concerned that will also lead to more confusion.)
I'm sorry. Although I implemented this validation, I'm getting burned by it too now. We didn't anticipate that Unicode was so widely used in header values. Should we really disallow %x80-FF? The HTTP RFC7230 allows it, and many users pass Unicode values. Maybe we need to modify the gRPC spec? And I agree with @jonathanrhodes: if we disallow %x80-FF and drop invalid headers on the receiving side, this covert breaking change is confusing.
I have the same problem and don't have a better solution.
Any solution found besides downgrading?
It should be possible to use metadata key names ending with -bin; gRPC will base64-encode/decode their values on the wire.