spring-cloud-gateway icon indicating copy to clipboard operation
spring-cloud-gateway copied to clipboard

Double encoded URLs

Open maresja1 opened this issue 4 years ago • 30 comments

Describe the bug After migrating to cloud 2020.0.0-M6 from Hoxton.SR8 I started to have problems with double encoded URLs. It most probably only affects URL containing multiple = charactors in one query parameter. Problem most probably is that ServerWebExchangeUtils#containsEncodedParts reports false - taking URL as not encoded.

Sample

Exception is trapped in:

catch (IllegalArgumentException ignore) {
  if (log.isTraceEnabled()) {
    log.trace("Error in containsEncodedParts", ignore);
  }
}

java.lang.IllegalArgumentException: Invalid character '=' for QUERY_PARAM in "key==%22value%22"

For URL:

/filter?query=key=="value"&page=0&size=30

This "kind" of URL should be valid and is used by https://github.com/jirutka/rsql-parser or FIQL (https://tools.ietf.org/html/draft-nottingham-atompub-fiql-00#section-4)

Seems the validation is too strict. As it cannot be replaced with custom implementation, it would essentially mean rewriting the RouteToRequestUrlFilter filter (and potentially others, that use ServerWebExchangeUtils#containsEncodedParts).

maresja1 avatar Dec 06 '20 15:12 maresja1

@maresja1 can you post some code can reproduce this issue,i write a simple example it works well.

ctlove0523 avatar Dec 13 '20 10:12 ctlove0523

@ctlove0523
I had meet same problem, in my case the gateway incoming uri encoded as:

/get_rptlist?client=011&rptType=[%22PY%22]

the detail debug info show as belows invalid char

flyonce avatar Jan 07 '21 08:01 flyonce

@ctlove0523 I created PR #2137 containing a new unit test reproducing the error. Running ./mvnw test -pl spring-cloud-gateway-sample I now get the following failure:

[ERROR]   GatewaySampleApplicationTests.doubleEncoding:160->lambda$doubleEncoding$7:163 
Expecting:
 <"http://localhost:43675/httpbin/post?query=key%3D%3D%2527abc%2527">
to end with:
 <"/post?query=key==%27abc%27">

I now consider this error reproduced :slightly_smiling_face:

maresja1 avatar Feb 02 '21 23:02 maresja1

@maresja1 I run you test,it not passed but not SCG does,you can read this source code to find why:org.springframework.web.util.DefaultUriBuilderFactory and org.springframework.web.util.UriComponentsBuilder,method used to rebuild url is the follow method:

public UriComponents build() {
	return build(false);
}

ctlove0523 avatar Feb 03 '21 00:02 ctlove0523

@ctlove0523 Sorry, but I do not understand. What do you mean by

I run you test,it not passed but not SCG does,you can read this source code to find why

or by

I think this test can not certificate SCG double encode URL.UriComponentsBuilder not check URL has encode or not before decode URL"?

Are you trying to say, that this expected behaviour - thus not a bug? Or are you saying that this is a bug and suggesting way of fixing it? I do not need analysis of what happened, I already provided that in my first comment on this issue.

I was asked to provide reproducible example, so I provided you with the test. It fails - that is expected - because there is a bug in SGC. Is anyone going to look at this bug? Thank you.

maresja1 avatar Feb 03 '21 08:02 maresja1

@maresja1 Client send request to SCG,SCG then forward request to proxy service,i mean client encode the encoded URL not SCG.

ctlove0523 avatar Feb 03 '21 08:02 ctlove0523

@ctlove0523 Client sends the URL encoded, but SGC "thinks" that it is not encoded, so it encodes it once more, that is the problem. If you decode:

http://localhost:43675/httpbin/post?query=key%3D%3D%2527abc%2527

you get

http://localhost:43675/httpbin/post?query=key==%27abc%27,

but if you decode

http://localhost:43675/httpbin/post?query=key==%27abc%27

you get

http://localhost:43675/httpbin/post?query=key=='abc',

Which is a big difference. Do you understand?

maresja1 avatar Feb 03 '21 08:02 maresja1

@spencergibb I apologize for approaching you directly, but is there a chance, someone else could look at this? I am confident this is a bug (please see the unit test I provided) and I think there will be more users affected by it. As SGC should be a generic tool, it shouldn't restrict the form of URLs more than the HTTP specification does.

I considered creating PR that fixes the issue, but I think it needs deeper analysis and considerations.

We really like SGC and we would like to upgrade to the newer version, but this prevents us from doing so. Thank you in advance.

maresja1 avatar Feb 04 '21 13:02 maresja1

We have encountered the same issue. I thought I would share full details, our analysis, and how we have for now worked around the problem, in case it is useful either to Spring developers working on a fix, or anyone in the community who comes across the same issue and wants to be unblocked.

Context

  • We have a number of Spring-based microservices which call each other using Feign.
  • In a number of cases, we have an internal POJO representing the parameters we wish to use on a GET (let's call this SearchRequest); we serialise this object into query parameters via a custom implementation of the Feign Encoder class - i.e. this interface https://github.com/OpenFeign/feign/blob/master/core/src/main/java/feign/codec/Encoder.java#L79.
  • In our custom encoder, we:
    • check for the expected object type
    • use reflection to get all fields in the POJO
    • populate a Map<String, Collection<String>> queries variable with the data from all fields
    • call queries.putAll(template.queries()) to make sure any query params defined otherwise are retained
    • call template.queries(queries) to put those query params onto the RequestTemplate.
  • Under the covers, this ends up (see https://github.com/OpenFeign/feign/blob/master/core/src/main/java/feign/template/Template.java#L147-L155) calling UriUtils.encode(value, this.charset, true), which performs a partial URI encoding.
    • Taking, for example, a field on SearchRequest (let's call it query) containing the following String value: field=("value")
    • the partial URI encoding performed here acts on the quotes but not the equals or bracket symbols, turning this into field=(%22value%22).

Before using Spring-Cloud loadbalancer

  • Ribbon loadbalancer is in use
  • After service discovery resolution, the URI requested over the wire would be like http://resolved-host:port/api/search?query=field=(%22value%22).
  • The receiving service decodes this just fine and performs the search.

With Spring-Cloud loadbalancer

To disable Ribbon, and use the Spring-Cloud loadbalancer instead, we set the following in our bootstrap.yml

spring:
  cloud:
    loadbalancer:
      ribbon:
        enabled: false

Then...

  • https://github.com/spring-cloud/spring-cloud-openfeign/blob/main/spring-cloud-openfeign-core/src/main/java/org/springframework/cloud/openfeign/loadbalancer/FeignBlockingLoadBalancerClient.java#L102 calls String reconstructedUrl = loadBalancerClient.reconstructURI(instance, originalUri).toString();
  • In turn (see https://github.com/spring-cloud/spring-cloud-commons/blob/main/spring-cloud-loadbalancer/src/main/java/org/springframework/cloud/loadbalancer/blocking/client/BlockingLoadBalancerClient.java#L134-L137) this calls LoadBalancerUriTools.reconstructURI(serviceInstance, original);
  • In order to decide whether the URI should be encoded (see https://github.com/spring-cloud/spring-cloud-commons/blob/main/spring-cloud-commons/src/main/java/org/springframework/cloud/client/loadbalancer/LoadBalancerUriTools.java#L59-L68) there's the following insightful comment: // Verify if it is really fully encoded. Treat partial encoded as unencoded.
  • As a result, the partially-encoded URI gets encoded a second time.
  • The URI requested over the wire becomes like http://resolved-host:port/api/search?query=field%3D%28%2522value%2522%29
  • The receiving service decodes this and ends up with field=(%22value%22). From its perspective, this is invalid query syntax, so 💥

Our workaround

  • We have been able to work around this by modifying our custom encoder to call UriUtils.encode(input, StandardCharsets.UTF_8) on all field values before invoking template.queries(queries).
  • In the above example, this makes the query param like field%3D%28%22value%22%29.
  • Since this is already fully encoded, both the Feign encoding operation and the Spring-Cloud LB encoding operation make no changes.

I hope this information is useful to others. Even if you're not using Feign, perhaps the hint of "make sure things are fully, not partially encoded" might be helpful.

oliverlockwood avatar May 26 '21 09:05 oliverlockwood

@oliverlockwood I am really interested in a workaround, but I cannot change the client nor the server. From your comment:

We have been able to work around this by modifying our custom encoder to call UriUtils.encode(input, StandardCharsets.UTF_8) on all field values before invoking template.queries(queries)

It is not quite clear, what exactly have you changed. Is it a change only on the side of the gateway? If so, can you provide a bit more detail?

maresja1 avatar May 27 '21 10:05 maresja1

@maresja1 What I have changed is our custom encoder, which as mentioned is an implementation of https://github.com/OpenFeign/feign/blob/master/core/src/main/java/feign/codec/Encoder.java#L79.

This is a bean used on the client side (we're using Feign as our client library). In this, where we determine the query params to dynamically add to the RequestTemplate, we're now URI-encoding them in our own code, rather than letting Feign partially-encode them.

As a result, when (again on the client side) we hit the spring-cloud LB (BlockingLoadBalancerClient) the code there sees that our URI is fully encoded and therefore doesn't encode it a second time.

The bug as I see it only occurs when spring-cloud LB considers a URI to partially encoded, and is not satisfied by that.

So my workaround requires the ability to change the client to force URIs to be fully encoded before we hit the spring-cloud LB layer.

Hope that clarifies things?

oliverlockwood avatar May 27 '21 10:05 oliverlockwood

I just ran in to this. It's ServerWebExchangeUtils.containsEncodedParts()

This is the PR this bug was introduced in: https://github.com/spring-cloud/spring-cloud-gateway/pull/467/files#diff-2d3161c1700281bec196dd652aa6f656d6b17aae078f5d4bdc146ba2c48f609cR91

Current version of it is this: https://github.com/spring-cloud/spring-cloud-gateway/blob/main/spring-cloud-gateway-server/src/main/java/org/springframework/cloud/gateway/support/ServerWebExchangeUtils.java#L233

Notice the exception is being swallowed here, which in my opinion is very dangerous.

This is what was happening in my case:

incoming URL: ?filter[something]&apostrophe%27 -- This results in the following steps:

  1. boolean encoded in the method is true because it contains a %
  2. UriComponentsBuilder errors out in the next lines because of
java.lang.IllegalArgumentException: Invalid character '[' for QUERY_PARAM in "filter[something]"
  1. Because this exception is being ignored, and only logged out to TRACE, the method now flags this as false instead of being partially encoded to true
  2. Back in RouteToRequestUrlFilter.filter( ... ) it will now continue to try to encode the URL

The result is now ?filter[something]&apostrophe%2527 where the %2527 part has been double encoded. Essentially it added a %25 for the % character.

@maresja1 Not sure if you're looking for a work around, but this is essentially what I did for now until this gets fixed:

public class GlobalUriFilter implements GlobalFilter, Ordered {
  @Override
  public Mono<Void> filter(ServerWebExchange exchange, GatewayFilterChain chain) {
    URI incomingUri = exchange.getRequest().getURI();
    if (isUriEncoded(incomingUri)) {
      // Get the original Gateway route (contains the service's original host)
      Route route = exchange.getAttribute(GATEWAY_ROUTE_ATTR);
      if (route == null) {
        return chain.filter(exchange);
      }

      // Use the original incomingUri path and query params
      URI mergedUri = URI.create(
          route.getUri() + incomingUri.getPath() + "?" + incomingUri.getRawQuery()
      );

      // Save it as the outgoing URI to call the service, and override the "wrongly" double encoded URI
      exchange.getAttributes().put(GATEWAY_REQUEST_URL_ATTR, mergedUri);
    }

    return chain.filter(exchange);
  }

  private static boolean isUriEncoded(URI uri) {
    // Implement your own logic... MIne was specifically for not encoded brackets.
    // Or you might not even need this method in your case
    return true;
  }

  @Override
  public int getOrder() {
    return RouteToRequestUrlFilter.ROUTE_TO_URL_FILTER_ORDER + 1;
  }
}

franzvezuli avatar Jan 12 '22 17:01 franzvezuli

@franzvezuli I made very similar observations, thanks for providing more detailed description.

Also thank you for sharing the stub for the workaround. That year ago, when I analyzed it, I wouldn't have hoped that additional filter can fix it in such a simple way. I was convinced UriComponentsBuilder or containsEncodedParts would be used in other filters, that would make the request fail somewhere else.

Your suggestion motivated me to look at it again and it seems to work.

For others, this is the stub with simple isUriEncoded method implemented. It is essentially the original containsEncodedParts without the try-catch calling UriComponentsBuilder. IMHO the comment in containsEncodedParts:

// Verify if it is really fully encoded. Treat partial encoded as unencoded.

does not make any sense anyway - even in case of partially encoded URL, the gateway should not try to encode it again. If it would, the already encoded parts will be encoded twice and decoding will produce different result. Whereas when it doesn't touch the URL, the final service might still be able to understand the request.

import org.springframework.cloud.gateway.filter.GatewayFilterChain;
import org.springframework.cloud.gateway.filter.GlobalFilter;
import org.springframework.cloud.gateway.filter.RouteToRequestUrlFilter;
import org.springframework.cloud.gateway.route.Route;
import org.springframework.core.Ordered;
import org.springframework.stereotype.Component;
import org.springframework.web.server.ServerWebExchange;
import reactor.core.publisher.Mono;

import java.net.URI;

import static org.springframework.cloud.gateway.support.ServerWebExchangeUtils.GATEWAY_REQUEST_URL_ATTR;
import static org.springframework.cloud.gateway.support.ServerWebExchangeUtils.GATEWAY_ROUTE_ATTR;

@Component
public class GlobalUriFilter implements GlobalFilter, Ordered {

	@Override
	public Mono<Void> filter(ServerWebExchange exchange, GatewayFilterChain chain) {
		URI incomingUri = exchange.getRequest().getURI();
		if (isUriEncoded(incomingUri)) {
			// Get the original Gateway route (contains the service's original host)
			Route route = exchange.getAttribute(GATEWAY_ROUTE_ATTR);
			if (route == null) {
				return chain.filter(exchange);
			}

			// Use the original incomingUri path and query params
			final var routeUri = route.getUri();
			URI mergedUri = createUri(incomingUri, routeUri);

			// Save it as the outgoing URI to call the service, and override the "wrongly" double encoded URI
			exchange.getAttributes().put(GATEWAY_REQUEST_URL_ATTR, mergedUri);
		}

		return chain.filter(exchange);
	}

	private URI createUri(URI incomingUri, URI routeUri) {
		final var port = routeUri.getPort() != -1 ? ":" + routeUri.getPort() : "";
		final var rawPath = incomingUri.getRawPath() != null ? incomingUri.getRawPath() : "";
		final var query = incomingUri.getRawQuery() != null ?  "?" + incomingUri.getRawQuery() : "";
		return URI.create(routeUri.getScheme() + "://" + routeUri.getHost() + port + rawPath + query);
	}

	private static boolean isUriEncoded(URI uri) {
		return (uri.getRawQuery() != null && uri.getRawQuery().contains("%"))
			|| (uri.getRawPath() != null && uri.getRawPath().contains("%"));
	}

	@Override
	public int getOrder() {
		return RouteToRequestUrlFilter.ROUTE_TO_URL_FILTER_ORDER + 1;
	}
}

maresja1 avatar Jan 15 '22 00:01 maresja1

@maresja1 Glad you were able to find it useful. And yes I agree, the original implementation of containsEncodedParts (without the try-catch second phase) to figure out if a URI is already encoded was the better solution IMO.

SCG should not care if a URL is fully or partially encoded. As long as there is some encoding done, it shouldn't encode it twice is the main issue.

franzvezuli avatar Jan 15 '22 01:01 franzvezuli

spring-cloud-gateway Version:3.0.6 http://127.0.0.1/xxxx/xxxx/xxxx?dateRange[]=2022-05-18

import lombok.var;
import org.springframework.cloud.gateway.filter.GatewayFilterChain;
import org.springframework.cloud.gateway.filter.GlobalFilter;
import org.springframework.cloud.gateway.filter.ReactiveLoadBalancerClientFilter;
import org.springframework.cloud.gateway.route.Route;
import org.springframework.core.Ordered;
import org.springframework.stereotype.Component;
import org.springframework.web.server.ServerWebExchange;
import reactor.core.publisher.Mono;

import java.net.URI;

import static org.springframework.cloud.gateway.support.ServerWebExchangeUtils.GATEWAY_REQUEST_URL_ATTR;
import static org.springframework.cloud.gateway.support.ServerWebExchangeUtils.GATEWAY_ROUTE_ATTR;

@Component
public class GlobalUriFilter implements GlobalFilter, Ordered {

	@Override
	public Mono<Void> filter(ServerWebExchange exchange, GatewayFilterChain chain) {
		URI incomingUri = exchange.getRequest().getURI();
		if (isUriEncoded(incomingUri)) {
			// Get the original Gateway route (contains the service's original host)
			Route route = exchange.getAttribute(GATEWAY_ROUTE_ATTR);
			if (route == null) {
				return chain.filter(exchange);
			}
			// Save it as the outgoing URI to call the service, and override the "wrongly" double encoded URI
		// in  ReactiveLoadBalancerClientFilter LoadBalancerUriTools::containsEncodedParts double encoded URI again
			URI balanceUrl = exchange.getRequiredAttribute(GATEWAY_REQUEST_URL_ATTR);
			URI mergedUri = createUri(incomingUri, balanceUrl);
			exchange.getAttributes().put(GATEWAY_REQUEST_URL_ATTR, mergedUri);
		}

		return chain.filter(exchange);
	}

	private URI createUri(URI incomingUri, URI balanceUrl) {
		final var port = balanceUrl.getPort() != -1 ? ":" + balanceUrl.getPort() : "";
		final var rawPath = balanceUrl.getRawPath() != null ? balanceUrl.getRawPath() : "";
		final var query = incomingUri.getRawQuery() != null ?  "?" + incomingUri.getRawQuery() : "";
		return URI.create(balanceUrl.getScheme() + "://" + balanceUrl.getHost() + port + rawPath + query);
	}

	private static boolean isUriEncoded(URI uri) {
		return (uri.getRawQuery() != null && uri.getRawQuery().contains("%"))
				|| (uri.getRawPath() != null && uri.getRawPath().contains("%"));
	}
	
	//order after ReactiveLoadBalancerClientFilter
	@Override
	public int getOrder() {
		return ReactiveLoadBalancerClientFilter.LOAD_BALANCER_CLIENT_FILTER_ORDER + 1;
	}
}

jinxiaoyi avatar May 24 '22 09:05 jinxiaoyi

Same issue here. Any chance to get a proper fix?

@franzvezuli thank you for sharing your workaround

davebaol avatar Jan 09 '23 18:01 davebaol

I've got the same issue, found the root cause is the different behaviour for new URI() and URI.create(). Could you please fix this issue, Spring Cloud Gateway is a common tool used by other services, it's common to pass in encoded URI, it's very annoying the encoded URI is encoded again.

kyle-wang-sd avatar Mar 06 '23 10:03 kyle-wang-sd

For what it is worth, there seems to be some disagreement on whether [ and ] should be encoded. While not reserved it appears to be a good practice to encode them. So, URI in Java doesn't throw an Exception for those characters, but URLEncoder.encode will encode them. A similar problem occurs in C#, but is a bit more hidden since C# new URI(string) will encode the reserved characters for you rather than throwing an exception. Similarly, .NET HttpUtility.UrlEncode(string) will encode [ and ]. I guess, it is the ambiguity between truly reserved characters and and characters that are risky that seems to result in clients triggering this bug.

Edit: After poking around more. org.springframework.web.utils.HierarchicalURIComponnents references RFC3986, which appears to say [ and ] are not allowed in the query part of a URI. So, the question at hand is how should Spring Cloud Gateway deal with URIs whose syntax doesn't conform. I agree with others that quietly trying to "fix" them is probably not right.

donrw avatar Mar 22 '23 00:03 donrw

Same issue here with mvc gateway. Is there a workarround for mvc gateway ?

dutrieux avatar Feb 09 '24 09:02 dutrieux

你好,你的邮件我已经收到,每天会在晚上查看邮件并回复。急事请打电话

ctlove0523 avatar Feb 09 '24 09:02 ctlove0523

We encountered a similar problem with RewriteRequestParameterGatewayFilterFactory for which I submitted a bug for spring-web: https://github.com/spring-projects/spring-framework/issues/32234

bahrigencsoy avatar Feb 09 '24 16:02 bahrigencsoy

UriComponentsBuilder (and HierarchicalUriComponents) doesn't allow = for either keys or values. Whether it aligns with the IETF guidelines or not (I'm not sure yet), I don't see how it has anything to do with Spring Cloud Gateway since it's a shared web class

ServerWebExchangeUtils (which does belong to Gateway) could opt not to use UriComponentsBuilder, but I believe decisions on what constitutes a valid URI component should not be made on an ad-hoc, project-by-project basis

// org.springframework.web.util.HierarchicalUriComponents.verify()

  this.queryParams.forEach((key, values) -> {
	  verifyUriComponent(key, Type.QUERY_PARAM);
	  for (String value : values) {
		  verifyUriComponent(value, Type.QUERY_PARAM);
	  }
  });
// org.springframework.web.util.HierarchicalUriComponents.Type

QUERY_PARAM {
	@Override
	public boolean isAllowed(int c) {
		if ('=' == c || '&' == c) {
			return false;
		}
		else {
			return isPchar(c) || '/' == c || '?' == c;
		}
	}
}

reserved = gen-delims / sub-delims

gen-delims = ":" / "/" / "?" / "#" / "[" / "]" / "@"

sub-delims = "!" / "$" / "&" / "'" / "(" / ")"

/ "*" / "+" / "," / ";" / "="

(RFC 3986)

NadChel avatar Mar 24 '24 07:03 NadChel