spring-cloud-gateway icon indicating copy to clipboard operation
spring-cloud-gateway copied to clipboard

Use getPath instead of getRawPath to prevent doulbe encoding of the URI

Open jensmatw opened this issue 1 year ago • 5 comments

Use getPath instead of getRawPath to prevent doulbe encoding of the URI to fix issue #3482.

jensmatw avatar Jul 30 '24 12:07 jensmatw

See https://github.com/spring-cloud/spring-cloud-gateway/pull/3437

rworsnop avatar Jul 30 '24 21:07 rworsnop

I'm not sure this is going to work. Consider the url "http://example.com/foo–bar foo" (that's an en dash, not a dash):

 String url = "http://example.com/foo%E2%80%93bar%20foo";
 URI uri = URI.create(url);
 String path = uri.getPath();
 URI uri2 = UriComponentsBuilder.fromUri(uri).replacePath(path).build().toUri(); 

uri2 is "http://example.com/foo–bar%20foo", so the space got re-encoded properly, but the en dash did not.

rworsnop avatar Sep 25 '24 17:09 rworsnop

I'm not sure this is going to work. Consider the url "http://example.com/foo–bar foo" (that's an en dash, not a dash):

You are right, it only solves the issue for ascii characters because it seems UriComponentsBuilder does not encode anything else. I'm not sure if there is an easy solution. I have seen your PR (#3437), which would encode space and the en dash correctly. But for rewrite path it would do the regex on the encoded url which i guess no one expects when writing a rewrite rule.

@rworsnop @spencergibb Do you have an idea how we could support both, rewrite (applying regex) on the correct path and still maintain support for non-ascii?

jensmatw avatar Sep 27 '24 16:09 jensmatw

@rworsnop it seems that neither jdks URI nor the UriComponentsBuilder from Spring escapes unicodes outside of ascii.

jshell> new URI("http", "localhost", "/te-st test", null)
$32 ==> http://localhost/te-st%20test
UriComponentsBuilder.fromUri(request.uri()).replacePath("/te-st test").build().toUri();
http://localhost:8080/te-st%20test

Yet, the code from PR is still working for unicode characters, because it seems the http-client or something after the filter is escaping the en-dash correctly. But I found another issue: While the URI from componentsbuilder with te-st is sending te%E2%80%93st and te set is sending te%20% there is an issue if you have both! If we leave the filter with the URI te-st test it gets send to the downstream server as te%E2%80%93st%2520test, so it is double-encoded again. Something in the spring chain is "detecting" the unicode character and decides to encode everything again. Maybe it is also somewhere in the JDK.

jensmatw avatar Sep 30 '24 13:09 jensmatw

I've found a solution, it seems .encode() from the UriComponentsBuilder does encode everything correctly. It does not double encode the space. It still encodes the unicodes (see tests).

@spencergibb tests have also been added

jensmatw avatar Sep 30 '24 13:09 jensmatw

@spencergibb Would it be possible to get either this fix or #3437 included in the next release? We are also encountering this issue with spaces (and other special characters) encoding twice.

sshefferly avatar Dec 04 '24 18:12 sshefferly

related https://github.com/spring-cloud/spring-cloud-gateway/issues/3657#issue-2767306840

raccoonback avatar Jan 05 '25 01:01 raccoonback

I'm going to actively look at this class of problem in the next few weeks.

spencergibb avatar Jan 24 '25 01:01 spencergibb

This is also addressed in https://github.com/spring-cloud/spring-cloud-gateway/pull/3658 which appears to address another double encoding issue, so I am closing this PR.

ryanjbaxter avatar Mar 13 '25 20:03 ryanjbaxter

@ryanjbaxter there's an important difference that is only important for the rewritePath filter: The regex is applied on the path and I think everyone would assume, that the path is not escaped when writing a regex. Thats why I used getPath, so the method uses internally the unescaped string to apply the regex.

Should i try to add the code from here into #3658? Or should leave out rewritePath from #3658 and use this PR only for this filter?

Alternatively, I think my approach here would also work for all other filters, so we could also add the query-escaping code form #3658 here, but I haven't tested.

jensmatw avatar Mar 14 '25 09:03 jensmatw

Now, #3658 has been merged without adding any fix for rewritePath (or any other filter except setPath and stripPrefix).

Can we reopen this?

jensmatw avatar Mar 14 '25 15:03 jensmatw

Yes my bad, sorry about that!

ryanjbaxter avatar Mar 14 '25 18:03 ryanjbaxter

@jensmatw Could you make the PR against the 4.1.x branch and resolve the merge conflicts?

ryanjbaxter avatar Mar 14 '25 18:03 ryanjbaxter

@jensmatw can you sign your commits so the DCO check passes?

ryanjbaxter avatar Mar 17 '25 19:03 ryanjbaxter