dialogue icon indicating copy to clipboard operation
dialogue copied to clipboard

[Draft] Include ':' and '@' in pchar definition for url encoding

Open fawind opened this issue 8 months ago • 3 comments

Before this PR

Putting this up for potential discussion - not sure if we actually want to make this change.

Context: We ran into the case where requests of a dialogue client get rejected by google-container-registry because dialogue would url encode the colon : in path segments (e.g. sha256:c48bxxx) while GCR only accepts non-encoded : in path segments.

Looking into Dialogue's url encoding, I noticed that Dialogue's implementation doesn't fully match the referenced RFC-3986. Most notably, Dialogue is defining the pchar matcher as pchar = unreserved, while the RFC is a bit more permissive here and also includes sub-delims, :, and @:

pchar         = unreserved / pct-encoded / sub-delims / ":" / "@"

Note that we have another explicit divergence for query params but this one is well documented and for compatibility reasons:

https://github.com/palantir/dialogue/blob/05ea07174aa8d9ea77cd09585f0852ea3554b354/dialogue-core/src/main/java/com/palantir/dialogue/core/BaseUrl.java#L247-L251

Unclear points:

  • In the RFC, pchar also includes sub-delims. But given the comment above, it seems like we want to purposfully encode sub-delims?
  • This logic has been around since early 2019 (PR). Given this never came up as an issue, maybe we don't feel like its worth touching this code?
  • This is just a spec, and it's hard for me to judge the impact of such a change across all the consumers.

After this PR

Extend the pchar matcher to also include : and @. This will result in those characters no longer being url encoded in path segments.

==COMMIT_MSG== Include ':' and '@' in pchar definition for url encoding ==COMMIT_MSG==

Possible downsides?

fawind avatar Jun 10 '24 13:06 fawind