spring-cloud-gateway icon indicating copy to clipboard operation
spring-cloud-gateway copied to clipboard

Empty CloseWebSocketFragment leads to sending an invalid CloseStatus 1005

Open Johannes-Rost opened this issue 3 years ago • 12 comments

Recently we are seeing the following exception sporadically in our logs. It seems like org.springframework.web.reactive.socket.adapter.ReactorNettyWebSocketSession is not working correctly if the io.netty.handler.codec.http.websocketx.CloseWebSocketFrame is empty:

Stack Trace:
  reactor.core.Exceptions$ErrorCallbackNotImplemented: java.lang.IllegalArgumentException: WebSocket close status code does NOT comply with RFC-6455: 1005
Caused by: java.lang.IllegalArgumentException: WebSocket close status code does NOT comply with RFC-6455: 1005
        at io.netty.handler.codec.http.websocketx.CloseWebSocketFrame.requireValidStatusCode(CloseWebSocketFrame.java:209)
        at io.netty.handler.codec.http.websocketx.CloseWebSocketFrame.<init>(CloseWebSocketFrame.java:69)
        at reactor.netty.http.client.WebsocketClientOperations.sendClose(WebsocketClientOperations.java:213)
        at org.springframework.web.reactive.socket.adapter.ReactorNettyWebSocketSession.close(ReactorNettyWebSocketSession.java:124)
        at reactor.core.publisher.MonoFlatMap$FlatMapMain.onNext(MonoFlatMap.java:125)
        at reactor.core.publisher.FluxFilter$FilterSubscriber.onNext(FluxFilter.java:113)
        at reactor.core.publisher.FluxMap$MapConditionalSubscriber.onNext(FluxMap.java:220)
        at reactor.core.publisher.FluxFirstWithSignal$FirstEmittingSubscriber.onNext(FluxFirstWithSignal.java:330)
        at org.springframework.cloud.sleuth.instrument.reactor.ScopePassingSpanSubscriber.onNext(ScopePassingSpanSubscriber.java:88)
        at reactor.core.publisher.Operators$MonoInnerProducerBase.complete(Operators.java:2634)
        at reactor.core.publisher.SinkOneMulticast.tryEmitValue(SinkOneMulticast.java:70)
        at reactor.netty.http.server.WebsocketServerOperations.sendCloseNow(WebsocketServerOperations.java:258)
        at reactor.netty.http.server.WebsocketServerOperations.onInboundNext(WebsocketServerOperations.java:156)
        at reactor.netty.channel.ChannelOperationsHandler.channelRead(ChannelOperationsHandler.java:93)
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379)
[...]

I already discussed this in detail in a bug report for Reactor Netty: https://github.com/reactor/reactor-netty/issues/1891 @violetagg was able to convince me that Reactor Netty is doing everything right at this point and that the problem lies in the way org.springframework.cloud.gateway.filter.WebsocketRoutingFilter uses ReactorNettyWebSocketSession

Johannes-Rost avatar Nov 24 '21 11:11 Johannes-Rost

Any chance to get feedback on this? We see this error quite frequently in our logs.

Johannes-Rost avatar Dec 07 '21 18:12 Johannes-Rost

Any news on this one?

Johannes-Rost avatar Dec 20 '21 09:12 Johannes-Rost

Would be great if we could get this triaged. Our logs get polluted by this stack trace quite often. Even though our websocket downstream service is working correctly

franzvezuli avatar Feb 17 '22 15:02 franzvezuli

I don't know if it's relevant for others, but I was able to fix this problem by increasing the maximum websocket message size. It seems the problem occurs when an error caused by a message too big is encountered by WebsocketRoutingFilter - or at least that seems to have been the case for me.

Anonymous-Coward avatar Feb 17 '22 19:02 Anonymous-Coward

I don't know if it's relevant for others, but I was able to fix this problem by increasing the maximum websocket message size. It seems the problem occurs when an error caused by a message too big is encountered by WebsocketRoutingFilter - or at least that seems to have been the case for me.

Thanks for that discovery. Can this be done through applycation.yml or is there a filter when setting up the RouteLocatorBuilder ?

franzvezuli avatar Feb 17 '22 19:02 franzvezuli

Reading over the reactor issue, I'm still not positive what needs to be done in gateway. @violetagg ?

spencergibb avatar Mar 02 '22 18:03 spencergibb

@spencergibb reactor.netty.http.websocket.WebsocketInbound#receiveCloseStatus returns the close statuses described in the specification (https://datatracker.ietf.org/doc/html/rfc6455#section-7.4.1). However not all of these statuses can be send with a close frame. Spring Cloud Gateway has to filter the close statuses in the code below before invoking CloseStatus.create(status.code(), status.reasonText()) https://github.com/spring-projects/spring-framework/blob/259bcd60fbbc5cdb8b230595a5004707f4c6ff23/spring-webflux/src/main/java/org/springframework/web/reactive/socket/adapter/ReactorNettyWebSocketSession.java#L127-L131

violetagg avatar Mar 02 '22 21:03 violetagg

We are also experiencing this log message.

So, by @franzvezuli 's comment, can anyone confirm that this is not affecting functionality?

jwedel avatar Mar 29 '22 12:03 jwedel

Hi @violetagg @Johannes-Rost @spencergibb We have replicated this issue by using spring-boot 2.5.12, spring-cloud 2020.0.5 and spring-cloud-gateway 3.1.1. The application is not being impacted at least in terms of Websockets communication, However, we would like to confirm if the Websockets finally are being closed. Can some one confirm? Also, the HttpTrafficHandler creates its own IdleTimeoutHandler only when the parameter idleTimeout is not null but we found it as null. Is there any default value?

hoglyrubio avatar May 03 '22 15:05 hoglyrubio

I'm still facing this issue in 2021.0.3 and too many logs generated in gateway side, i see you moved this bug to 2.21.0.4, hopefully it can be fixed ASAP.

hwy425 avatar Jul 15 '22 06:07 hwy425

Same on our side. We are facing this issue since we use the release train 2021 and too many alert pop on our side due to his log error. Hopefully this will be fixed asap.

axelczk avatar Aug 01 '22 14:08 axelczk

Any news on this one? This error is still polluting our logs.

Johannes-Rost avatar Sep 20 '22 09:09 Johannes-Rost

Is there any update on this issue?

RolandSall avatar Feb 09 '23 10:02 RolandSall

Yes is it possible to have a fix for this ?

axelczk avatar Feb 09 '23 12:02 axelczk

Hi, @spencergibb There is too many such error log, which seriously affects our troubleshooting. Is there a temporary solution to avoid this problem, such as bootstrap.yml settings or filter interception filling and StatusCode?

langxf avatar Feb 10 '23 19:02 langxf

Hi! Same here, lots of errors on my logs. Do you know something that could be done?

drando avatar Feb 12 '23 21:02 drando

I have tried upgrading to the newest gateway version of spring cloud. The same problem exists, to my knowledge, this is not affecting the behavior of the socket, can anyone confirm that the socket behavior is intact and it is just a log error issue?

RolandSall avatar Feb 15 '23 09:02 RolandSall

any alternative to suppress this error ?

github4sanjay avatar Feb 26 '23 06:02 github4sanjay