spring-cloud-gateway icon indicating copy to clipboard operation
spring-cloud-gateway copied to clipboard

Cache body for form requests

Open SmithJosh opened this issue 4 years ago • 39 comments

Describe the bug I have a gateway application configured to route requests to some backend service. The app has a single page which sends POST requests to this service. It also has a web filter which tries to read form data from the requests.

The issue is, form POSTs hang and eventually timeout. It appears to be caused by the web filter reading the form data, as AJAX POST requests (using query params instead of form data) do work. See the sample.

Gateway version: 2.2.1.RELEASE

Sample (originally reported against Spring Security, hence the name, but this is reproducible with only the gateway) https://github.com/SmithJosh/spring-security-8026/

SmithJosh avatar Feb 26 '20 22:02 SmithJosh

See 58834daf6271c3ae379f5eb0939689ad6a14036e this is the same problem with the HiddenHttpMethodFilter from webflux.

The framework team decided to not do anything about it. See https://github.com/spring-projects/spring-framework/issues/21824.

Gateway disables the HiddenHttpMethodFilter in anger. Not sure what to do about this.

Once getFormData() is used, the original raw content can no longer be read from the request body. For this reason, applications are expected to go through ServerWebExchange consistently for access to the cached form data versus reading from the raw request body.

This means the data is no longer available to forward and the downstream service has a content length and just waits for the content.

spencergibb avatar Feb 28 '20 20:02 spencergibb

@rstoyanchev I know framework decided against caching, but is there something gateway could do to mitigate this?

spencergibb avatar Feb 28 '20 20:02 spencergibb

I suppose in the routing filter if it's a form data or multipart data (suffers same problem), gateway could use the writers to serialize back to DataBuffers

spencergibb avatar Feb 28 '20 20:02 spencergibb

We are caching through ServerWebExchange#getFormData() and yes gateway could serialize form data but since reading the body, even indirectly via getFormData(), competes with Gateway trying to also read in order to forward, is this supported? Or rather what is the official way to access the content of a form request?

Is it through ServerWebExchange in which case, yes you'll need to serialize form data. Or through some other way like gateway filter or predicate? I imagine gateway needs to be aware one way or another that the application is accessing form data, or it'll have to read and serialize form data every time whether that's needed or not.

rstoyanchev avatar Mar 02 '20 09:03 rstoyanchev

We are caching through ServerWebExchange#getFormData() and yes gateway could serialize form data but since reading the body, even indirectly via getFormData(), competes with Gateway trying to also read in order to forward, is this supported?

In my opinion, this should be supported. There are legitimate use cases requiring reading form data at the gateway. Checking for CSRF tokens is my use case, but HiddenHttpMethodFilter needs to read it too, and in general it seems like a reasonable thing to support at the gateway.

I see Spring has the ContentCachingRequestWrapper supporting a multi-read use case for the Servlet API. Is there something similar for a ServerHttpRequest?

SmithJosh avatar Mar 02 '20 19:03 SmithJosh

We do have caching for the request, but it only starts when something matches a gateway route, after a normal WebFliter may have already accessed the form data and only in certain cases (reading the body or retrying with a body).

spencergibb avatar Mar 02 '20 20:03 spencergibb

I imagine gateway needs to be aware one way or another that the application is accessing form data

How would it know that? This case is shown in a WebFilter before reaching the gateway entry point.

spencergibb avatar Mar 02 '20 20:03 spencergibb

Maybe we could override HttpWebHandlerAdapter.createExchange() and extend DefaultServerWebExchange so we can mark when getFormData() or getMultipartData() are called?

spencergibb avatar Mar 02 '20 20:03 spencergibb

In my opinion, this should be supported

Yes, question is how.

HiddenHttpMethodFilter needs to read it

HiddenHttpMethodFilter has no role to play and should never be needed on a request forwarded by the gateway.

We do have caching for the request, but it only starts when something matches a gateway route

Would it make sense to enhance this to provide access to form data somehow? So that it's done in the context of something that gateway handles and understands.

Or perhaps there could be a gateway-provided base WebFilter for access to form data?

rstoyanchev avatar Mar 02 '20 21:03 rstoyanchev

Or perhaps there could be a gateway-provided base WebFilter for access to form data?

Something needs to be provided for normal WebFilters. Should it be transparent (custom ServerWebExchange, etc...) or explicit. The latter will be simpler.

spencergibb avatar Mar 02 '20 21:03 spencergibb

Would that also include the ability to read the body, irrespective of form data?

rstoyanchev avatar Mar 02 '20 21:03 rstoyanchev

It certainly could. We have the framework for that already for both serialized body (predicates) and raw body (retry filters).

spencergibb avatar Mar 02 '20 21:03 spencergibb

So then perhaps wrapping the request to intercept getBody and essentially cache it if accessed might be something to try.

rstoyanchev avatar Mar 02 '20 21:03 rstoyanchev

@spencergibb Any update on this?

SmithJosh avatar Apr 06 '20 16:04 SmithJosh

no

spencergibb avatar Apr 06 '20 16:04 spencergibb

@spencergibb

Is this issue still being discussed in the team? We are experiencing the same problem when having an authorization server behind a Spring Cloud Gateway server. If CSRF protection is enabled, the submit of the login form hangs and eventually results in a timeout. Possible workarounds are disabling CSRF for the login which is most probably not a good idea or updating the login screen to use an ajax request instead of a standard form submit.

dsteegen avatar Feb 08 '21 11:02 dsteegen

@SmithJosh FYI: I was able to fix this by creating a filter that caches the incoming request in case it contains form data.

@Component
@RequiredArgsConstructor
public class CacheFormDataFilter implements WebFilter {

    private final ServerCodecConfigurer codecConfigurer;
    private final WebSessionManager sessionManager;
    private final LocaleContextResolver localeContextResolver;

    @Override
    public Mono<Void> filter(ServerWebExchange serverWebExchange, WebFilterChain chain) {
        if (requestContainsFormData(serverWebExchange)) {
            return cacheRequestBody(serverWebExchange, (cachedRequest) -> {
                var exchange = new DefaultServerWebExchange(cachedRequest, serverWebExchange.getResponse(), sessionManager, codecConfigurer, localeContextResolver);
                return chain.filter(exchange);
            });
        }

        return chain.filter(serverWebExchange);
    }

    private boolean requestContainsFormData(ServerWebExchange serverWebExchange) {
        var contentType = serverWebExchange.getRequest().getHeaders().getContentType();
        return APPLICATION_FORM_URLENCODED.isCompatibleWith(contentType);
    }
}

Then I configure this filter to be applied just before the Spring Security CSRF WebFilter

@Bean
    public SecurityWebFilterChain springSecurityFilterChain(ServerHttpSecurity http, CacheFormDataFilter cacheFormDataFilter) {
        return http
               ....
                .addFilterBefore(cacheFormDataFilter, CSRF)
               .... 
                .build();
    }

So whenever the getFormData() method of DefaultServerWebExchange is subscribed, it will make use of a cached ServerHttpRequest to parse the body to a MultiValueMap in the DefaultServerWebExchange class.

dsteegen avatar Feb 19 '21 14:02 dsteegen

I think we are seeing direct memory leak when using this workaround (and eventually throwing out of direct memory). After removing our filter for caching request body the direct memory stops leaking.

tuomoa avatar Jun 16 '21 06:06 tuomoa

It looks like ServerWebExchangeUtils.cacheRequestBody is creating composite DataBuffer which seems to be not released correctly. I guess netty will release the original buffer when connection is closed but this composite buffer is not released.

I think spring should handle this (in the cache util maybe?), but as a workaround we could release the cached composite buffer manually when the mono is completed by adding this kind of transformDeferred operation:

    @Override
    public Mono<Void> filter(ServerWebExchange serverWebExchange, WebFilterChain chain) {
        if (requestContainsFormData(serverWebExchange)) {
            return ServerWebExchangeUtils.cacheRequestBody(serverWebExchange, (cachedRequest) -> {
                var exchange = new DefaultServerWebExchange(cachedRequest, serverWebExchange.getResponse(), sessionManager, codecConfigurer, localeContextResolver);
                return chain.filter(exchange);
            }).transformDeferred(call -> {
                return call.doFinally(t -> {
                    DataBuffer cachedBody = serverWebExchange.getAttributeOrDefault(CACHED_REQUEST_BODY_ATTR, null);
                    if (cachedBody != null) {
                        DataBufferUtils.release(cachedBody);
                    }
                });
            });
        }

        return chain.filter(serverWebExchange);
    }

There is RemoveCachedBodyFilter for spring-cloud-gateway (it's GlobalFilter) but this workaround is a WebFilter (and we want to run csrf check before gateway chain..) Also the javadoc anywhere did not really tell you to make sure the cached buffer is released somewhere. So if there is people using this workaround make sure you are handling releasing the buffer somehow. :)

tuomoa avatar Jun 16 '21 08:06 tuomoa

@tuomoa I used your workaround, but the memory still leaks.

FrankChen021 avatar Oct 14 '21 12:10 FrankChen021

@tuomoa I used your workaround, but the memory still leaks.

@FrankChen021 Interesting. We are not seeing memory leak after doing pretty much the fix proposed above. Have you been able to figure out what is happening? Are you sure that your Mono is completed eventually and that doFinally is called?

tuomoa avatar Nov 02 '21 11:11 tuomoa

So, How to repeatedly read FormData?

lsh1358046425 avatar Jun 09 '22 05:06 lsh1358046425

The problem still exists in the latest version, is there a solution? @spencergibb @rstoyanchev

tianshuang avatar Jun 28 '22 07:06 tianshuang

@tuomoa In my case, your method also memory leak. What is your filter order?

lsh1358046425 avatar Jun 30 '22 08:06 lsh1358046425

@tuomoa In my case, your method also memory leak. What is your filter order?

@FrankChen021 @tianshuang @lsh1358046425 Our WebFilter has @Order(Ordered.HIGHEST_PRECEDENCE + 2)

It's been quite a while when we did create the workaround, but here is our complete filter code if it helps. Looks like we did end up having a custom decorator for the cachedRequest instead of using the DefaultServerWebExchange. Do note that we only cache when request contains formdata. Your use case might be different.

package demo;

import org.springframework.cloud.gateway.support.ServerWebExchangeUtils;
import org.springframework.core.Ordered;
import org.springframework.core.ResolvableType;
import org.springframework.core.annotation.Order;
import org.springframework.core.codec.Hints;
import org.springframework.core.io.buffer.DataBuffer;
import org.springframework.core.io.buffer.DataBufferUtils;
import org.springframework.http.InvalidMediaTypeException;
import org.springframework.http.MediaType;
import org.springframework.http.codec.HttpMessageReader;
import org.springframework.http.codec.ServerCodecConfigurer;
import org.springframework.util.CollectionUtils;
import org.springframework.util.LinkedMultiValueMap;
import org.springframework.util.MultiValueMap;
import org.springframework.web.server.ServerWebExchange;
import org.springframework.web.server.ServerWebExchangeDecorator;
import org.springframework.web.server.WebFilter;
import org.springframework.web.server.WebFilterChain;
import reactor.core.publisher.Mono;

import static org.springframework.cloud.gateway.support.ServerWebExchangeUtils.CACHED_REQUEST_BODY_ATTR;
import static org.springframework.http.MediaType.APPLICATION_FORM_URLENCODED;

/**
 * Decorate exchange and request to cache body form data.
 * Related to https://github.com/spring-cloud/spring-cloud-gateway/issues/1587
 *
 * Without this Csrf filter would read the body when reading formData and after that
 * when some other component tries to read the body it fails. With this, body is cached.
 *
 * Without decorating the server web exchange the formData reading accesses the original mono initialised
 * in {@link org.springframework.web.server.adapter.DefaultServerWebExchange} and would cause body to be
 * not cached. When decorator overrides it, read will actually cause the body cache to be populated.
 *
 * This filter needs to be before csrf filter in the chain.
 */
@Order(Ordered.HIGHEST_PRECEDENCE + 2)
public class RequestBodyCachingFilter implements WebFilter {

    private final ServerCodecConfigurer configurer;

    public RequestBodyCachingFilter(ServerCodecConfigurer configurer) {
        this.configurer = configurer;
    }

    @Override
    public Mono<Void> filter(ServerWebExchange exchange, WebFilterChain chain) {
        DataBuffer body = exchange.getAttributeOrDefault(CACHED_REQUEST_BODY_ATTR, null);

        if (body != null) {
            return chain.filter(exchange);
        }

        // Only cache if form data (because csrf inside only form data)
        if (requestContainsFormData(exchange)) {

            return ServerWebExchangeUtils.cacheRequestBody(exchange, (serverHttpRequest) -> {

                // don't mutate and build if same request object
                if (serverHttpRequest == exchange.getRequest()) {
                    return chain.filter(exchange);
                }

                return chain.filter(new FormDataServerWebExchangeDecorator(
                        exchange.mutate().request(serverHttpRequest).build(),
                        configurer));

            }).transformDeferred(call -> call.doFinally(t -> {
                /**
                 * See https://github.com/spring-cloud/spring-cloud-gateway/issues/1587
                 * Spring cacheRequestBody does not seem to release cached databuffer correctly and
                 * this leads to memory leak. Release the buffer manually after mono is completed.
                 */
                DataBuffer cachedBody = exchange.getAttributeOrDefault(CACHED_REQUEST_BODY_ATTR, null);
                if (cachedBody != null) {
                    DataBufferUtils.release(cachedBody);
                }
            }));
        }

        return chain.filter(exchange);
    }

    private static boolean requestContainsFormData(ServerWebExchange serverWebExchange) {
        var contentType = serverWebExchange.getRequest().getHeaders().getContentType();
        return APPLICATION_FORM_URLENCODED.isCompatibleWith(contentType);
    }

    private static class FormDataServerWebExchangeDecorator extends ServerWebExchangeDecorator {

        private static final Mono<MultiValueMap<String, String>> EMPTY_FORM_DATA =
                Mono.just(CollectionUtils.unmodifiableMultiValueMap(new LinkedMultiValueMap<String, String>(0))).cache();
        private static final ResolvableType FORM_DATA_TYPE =
                ResolvableType.forClassWithGenerics(MultiValueMap.class, String.class, String.class);

        private final ServerCodecConfigurer configurer;

        public FormDataServerWebExchangeDecorator(ServerWebExchange delegate, ServerCodecConfigurer configurer) {
            super(delegate);
            this.configurer = configurer;
        }

        @Override
        public Mono<MultiValueMap<String, String>> getFormData() {
            try {
                MediaType contentType = getDelegate().getRequest().getHeaders().getContentType();
                if (APPLICATION_FORM_URLENCODED.isCompatibleWith(contentType)) {
                    return ((HttpMessageReader<MultiValueMap<String, String>>) configurer.getReaders().stream()
                            .filter(reader -> reader.canRead(FORM_DATA_TYPE, APPLICATION_FORM_URLENCODED))
                            .findFirst()
                            .orElseThrow(() -> new IllegalStateException("No form data HttpMessageReader.")))
                            .readMono(FORM_DATA_TYPE, getDelegate().getRequest(), Hints.none())
                            .switchIfEmpty(EMPTY_FORM_DATA)
                            .cache();
                }
            } catch (InvalidMediaTypeException ex) {
                // Ignore
            }
            return EMPTY_FORM_DATA;
        }

    }

}

Edit: Now that I think about it, I think the problem with my initial fix in the earlier comment is that it creates a whole new ServerWebExchange so it might end up having the same attribute in the initial exchange and the exchange passed to the chain and only the initial exchange attribute is cleaned.

The filter we are using does not do this as it only mutates the initial exchange request. So we have to do
exchange.mutate().request(serverHttpRequest).build() instead of new DefaultWebServerExchange. But now we arrive to the point why we need the FormDataServerWebExchangeDecorator. The initial exchange that is passed to this filter already has formDataMono field (and some filter such as CsrfWebFilter might use it) which has reference to the original body and not the decorated cached body. BUT when we decorate the original exchange here and override the getFormData the subsequent calls will use the decorated cached body and it will work. Probably something like this is the reasoning behind our implementation.. it's been a while so I might not remember all the details... 😅

tuomoa avatar Jul 05 '22 09:07 tuomoa

@tuomoa @spencergibb The difference is that I cache body used to get MultipartData. I using your latest code, memory is still leaking. I still hope that the official can provide the method of getMultipartData.

lsh1358046425 avatar Jul 05 '22 12:07 lsh1358046425

@tuomoa @spencergibb The difference is that I cache body used to get MultipartData. I using your latest code, memory is still leaking. I still hope that the official can provide the method of getMultipartData.

Maybe you should override the getMultipartData in the decorator instead of getFormData because that has the same problem (original exchange has created multipartMono which references the original body.

tuomoa avatar Jul 05 '22 13:07 tuomoa

@tuomoa In my case, your method also memory leak. What is your filter order?

@FrankChen021 @tianshuang @lsh1358046425 Our WebFilter has @Order(Ordered.HIGHEST_PRECEDENCE + 2)

It's been quite a while when we did create the workaround, but here is our complete filter code if it helps. Looks like we did end up having a custom decorator for the cachedRequest instead of using the DefaultServerWebExchange. Do note that we only cache when request contains formdata. Your use case might be different.

package demo;

import org.springframework.cloud.gateway.support.ServerWebExchangeUtils;
import org.springframework.core.Ordered;
import org.springframework.core.ResolvableType;
import org.springframework.core.annotation.Order;
import org.springframework.core.codec.Hints;
import org.springframework.core.io.buffer.DataBuffer;
import org.springframework.core.io.buffer.DataBufferUtils;
import org.springframework.http.InvalidMediaTypeException;
import org.springframework.http.MediaType;
import org.springframework.http.codec.HttpMessageReader;
import org.springframework.http.codec.ServerCodecConfigurer;
import org.springframework.util.CollectionUtils;
import org.springframework.util.LinkedMultiValueMap;
import org.springframework.util.MultiValueMap;
import org.springframework.web.server.ServerWebExchange;
import org.springframework.web.server.ServerWebExchangeDecorator;
import org.springframework.web.server.WebFilter;
import org.springframework.web.server.WebFilterChain;
import reactor.core.publisher.Mono;

import static org.springframework.cloud.gateway.support.ServerWebExchangeUtils.CACHED_REQUEST_BODY_ATTR;
import static org.springframework.http.MediaType.APPLICATION_FORM_URLENCODED;

/**
 * Decorate exchange and request to cache body form data.
 * Related to https://github.com/spring-cloud/spring-cloud-gateway/issues/1587
 *
 * Without this Csrf filter would read the body when reading formData and after that
 * when some other component tries to read the body it fails. With this, body is cached.
 *
 * Without decorating the server web exchange the formData reading accesses the original mono initialised
 * in {@link org.springframework.web.server.adapter.DefaultServerWebExchange} and would cause body to be
 * not cached. When decorator overrides it, read will actually cause the body cache to be populated.
 *
 * This filter needs to be before csrf filter in the chain.
 */
@Order(Ordered.HIGHEST_PRECEDENCE + 2)
public class RequestBodyCachingFilter implements WebFilter {

    private final ServerCodecConfigurer configurer;

    public RequestBodyCachingFilter(ServerCodecConfigurer configurer) {
        this.configurer = configurer;
    }

    @Override
    public Mono<Void> filter(ServerWebExchange exchange, WebFilterChain chain) {
        DataBuffer body = exchange.getAttributeOrDefault(CACHED_REQUEST_BODY_ATTR, null);

        if (body != null) {
            return chain.filter(exchange);
        }

        // Only cache if form data (because csrf inside only form data)
        if (requestContainsFormData(exchange)) {

            return ServerWebExchangeUtils.cacheRequestBody(exchange, (serverHttpRequest) -> {

                // don't mutate and build if same request object
                if (serverHttpRequest == exchange.getRequest()) {
                    return chain.filter(exchange);
                }

                return chain.filter(new FormDataServerWebExchangeDecorator(
                        exchange.mutate().request(serverHttpRequest).build(),
                        configurer));

            }).transformDeferred(call -> call.doFinally(t -> {
                /**
                 * See https://github.com/spring-cloud/spring-cloud-gateway/issues/1587
                 * Spring cacheRequestBody does not seem to release cached databuffer correctly and
                 * this leads to memory leak. Release the buffer manually after mono is completed.
                 */
                DataBuffer cachedBody = exchange.getAttributeOrDefault(CACHED_REQUEST_BODY_ATTR, null);
                if (cachedBody != null) {
                    DataBufferUtils.release(cachedBody);
                }
            }));
        }

        return chain.filter(exchange);
    }

    private static boolean requestContainsFormData(ServerWebExchange serverWebExchange) {
        var contentType = serverWebExchange.getRequest().getHeaders().getContentType();
        return APPLICATION_FORM_URLENCODED.isCompatibleWith(contentType);
    }

    private static class FormDataServerWebExchangeDecorator extends ServerWebExchangeDecorator {

        private static final Mono<MultiValueMap<String, String>> EMPTY_FORM_DATA =
                Mono.just(CollectionUtils.unmodifiableMultiValueMap(new LinkedMultiValueMap<String, String>(0))).cache();
        private static final ResolvableType FORM_DATA_TYPE =
                ResolvableType.forClassWithGenerics(MultiValueMap.class, String.class, String.class);

        private final ServerCodecConfigurer configurer;

        public FormDataServerWebExchangeDecorator(ServerWebExchange delegate, ServerCodecConfigurer configurer) {
            super(delegate);
            this.configurer = configurer;
        }

        @Override
        public Mono<MultiValueMap<String, String>> getFormData() {
            try {
                MediaType contentType = getDelegate().getRequest().getHeaders().getContentType();
                if (APPLICATION_FORM_URLENCODED.isCompatibleWith(contentType)) {
                    return ((HttpMessageReader<MultiValueMap<String, String>>) configurer.getReaders().stream()
                            .filter(reader -> reader.canRead(FORM_DATA_TYPE, APPLICATION_FORM_URLENCODED))
                            .findFirst()
                            .orElseThrow(() -> new IllegalStateException("No form data HttpMessageReader.")))
                            .readMono(FORM_DATA_TYPE, getDelegate().getRequest(), Hints.none())
                            .switchIfEmpty(EMPTY_FORM_DATA)
                            .cache();
                }
            } catch (InvalidMediaTypeException ex) {
                // Ignore
            }
            return EMPTY_FORM_DATA;
        }

    }

}

Edit: Now that I think about it, I think the problem with my initial fix in the earlier comment is that it creates a whole new ServerWebExchange so it might end up having the same attribute in the initial exchange and the exchange passed to the chain and only the initial exchange attribute is cleaned.

The filter we are using does not do this as it only mutates the initial exchange request. So we have to do exchange.mutate().request(serverHttpRequest).build() instead of new DefaultWebServerExchange. But now we arrive to the point why we need the FormDataServerWebExchangeDecorator. The initial exchange that is passed to this filter already has formDataMono field (and some filter such as CsrfWebFilter might use it) which has reference to the original body and not the decorated cached body. BUT when we decorate the original exchange here and override the getFormData the subsequent calls will use the decorated cached body and it will work. Probably something like this is the reasoning behind our implementation.. it's been a while so I might not remember all the details... 😅

In fact, Gateway has provided global org.springframework.cloud.gateway.filter.RemoveCachedBodyFilter to perform cleaning operations, which has the highest priority, and theoretically we do not need to manually clean up.

tianshuang avatar Jul 06 '22 02:07 tianshuang

In fact, Gateway has provided global org.springframework.cloud.gateway.filter.RemoveCachedBodyFilter to perform cleaning operations, which has the highest priority, and theoretically we do not need to manually clean up.

Yeap. But like I mentioned in https://github.com/spring-cloud/spring-cloud-gateway/issues/1587#issuecomment-862172641 that didn't seem to help when there is for example Csrf filter before the gateway chain and the RemoveCachedBodyFilter is in the gateway chain. If Csrf block the request it does not go through the gateway chain I think.

tuomoa avatar Jul 06 '22 08:07 tuomoa

@tianshuang Did you get it working by overriding the getMultipartData() similarly as we did with the getFormData() in the decorator?

tuomoa avatar Jul 14 '22 12:07 tuomoa