
[STORM-3751] NPE in WorkerState.transferLocalBatch

Open jira-importer opened this issue 4 years ago • 4 comments

Hello,

 

I've recently upgraded to Storm 2.2.0 and have been getting this error:

 

2021-03-07 04:36:51.061 o.a.s.m.n.StormServerHandler Netty-server-localhost-6700-worker-1 [ERROR] server errors in handling the request
java.lang.NullPointerException: null
        at org.apache.storm.daemon.worker.WorkerState.transferLocalBatch(WorkerState.java:543) ~[storm-client-2.2.0.jar:2.2.0]
        at org.apache.storm.messaging.DeserializingConnectionCallback.recv(DeserializingConnectionCallback.java:71) ~[storm-client-2.2.0.jar:2.2.0]
        at org.apache.storm.messaging.netty.Server.enqueue(Server.java:146) ~[storm-client-2.2.0.jar:2.2.0]
        at org.apache.storm.messaging.netty.Server.received(Server.java:264) ~[storm-client-2.2.0.jar:2.2.0]
        at org.apache.storm.messaging.netty.StormServerHandler.channelRead(StormServerHandler.java:51) ~[storm-client-2.2.0.jar:2.2.0]
        at org.apache.storm.shade.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362) [storm-shaded-deps-2.2.0.jar:2.2.0]
        at org.apache.storm.shade.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348) [storm-shaded-deps-2.2.0.jar:2.2.0]
        at org.apache.storm.shade.io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:340) [storm-shaded-deps-2.2.0.jar:2.2.0]
        at org.apache.storm.shade.io.netty.handler.codec.ByteToMessageDecoder.fireChannelRead(ByteToMessageDecoder.java:323) [storm-shaded-deps-2.2.0.jar:2.2.0]
        at org.apache.storm.shade.io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:297) [storm-shaded-deps-2.2.0.jar:2.2.0]
        at org.apache.storm.shade.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362) [storm-shaded-deps-2.2.0.jar:2.2.0]
        at org.apache.storm.shade.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348) [storm-shaded-deps-2.2.0.jar:2.2.0]
        at org.apache.storm.shade.io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:340) [storm-shaded-deps-2.2.0.jar:2.2.0]
        at org.apache.storm.shade.io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1434) [storm-shaded-deps-2.2.0.jar:2.2.0]
        at org.apache.storm.shade.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362) [storm-shaded-deps-2.2.0.jar:2.2.0]
        at org.apache.storm.shade.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348) [storm-shaded-deps-2.2.0.jar:2.2.0]
        at org.apache.storm.shade.io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:965) [storm-shaded-deps-2.2.0.jar:2.2.0]
        at org.apache.storm.shade.io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:163) [storm-shaded-deps-2.2.0.jar:2.2.0]
        at org.apache.storm.shade.io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:644) [storm-shaded-deps-2.2.0.jar:2.2.0]
        at org.apache.storm.shade.io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:579) [storm-shaded-deps-2.2.0.jar:2.2.0]
        at org.apache.storm.shade.io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:496) [storm-shaded-deps-2.2.0.jar:2.2.0]
        at org.apache.storm.shade.io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:458) [storm-shaded-deps-2.2.0.jar:2.2.0]
        at org.apache.storm.shade.io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:897) [storm-shaded-deps-2.2.0.jar:2.2.0]
        at java.lang.Thread.run(Thread.java:748) [?:1.8.0_272]
2021-03-07 04:36:51.061 o.a.s.m.n.StormServerHandler Netty-server-localhost-6700-worker-1 [INFO] Received error in netty thread.. terminating server... 

 
This issue happens every 20-30 minutes and causes the workers to die/restart.

It seems related to https://issues.apache.org/jira/browse/STORM-3141, but that issue appears to have been fixed in 2.0.

I am happy to provide more information but at the moment am unsure of what is relevant.

I have a suspicion that this is related to load-aware localOrShuffleGrouping ("LoadAwareShuffleGrouping") because this issue seems to have started when I switched the Grouping, but again, not sure if it's actually related.
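
One way to test this suspicion would be to turn load-aware messaging off for the topology and watch whether the NPE still appears. Below is a minimal sketch of that experiment, assuming the `topology.disable.loadaware.messaging` setting (exposed in storm-client as `Config.TOPOLOGY_DISABLE_LOADAWARE_MESSAGING`) is available in the Storm version in use. The class `LoadAwareToggle` and the method `withLoadAwareDisabled` are hypothetical names made up for illustration; a plain `Map` stands in for the topology config so the snippet runs without Storm on the classpath.

```java
import java.util.HashMap;
import java.util.Map;

public class LoadAwareToggle {
    // Assumption: this key matches Storm's Config.TOPOLOGY_DISABLE_LOADAWARE_MESSAGING.
    public static final String DISABLE_LOADAWARE_KEY = "topology.disable.loadaware.messaging";

    // Returns a copy of the given topology config with load-aware messaging disabled.
    // The input map is left untouched so the original settings can be restored easily.
    public static Map<String, Object> withLoadAwareDisabled(Map<String, Object> baseConf) {
        Map<String, Object> conf = new HashMap<>(baseConf);
        conf.put(DISABLE_LOADAWARE_KEY, true);
        return conf;
    }

    public static void main(String[] args) {
        Map<String, Object> base = new HashMap<>();
        base.put("topology.workers", 2);
        // In a real deployment the resulting map would be passed as the topology config
        // when submitting the topology.
        System.out.println(withLoadAwareDisabled(base));
    }
}
```

If the workers keep dying with the flag set, the grouping change is probably not the cause. If the errors stop, that points at the load-aware path, though it would not by itself explain why the NPE occurs.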


Originally reported by cozos, imported from: NPE in WorkerState.transferLocalBatch
  • status: Open
  • priority: Major
  • resolution: Unresolved
  • imported: 2025-01-24

jira-importer avatar Mar 07 '21 06:03 jira-importer

s4nch3z:

I can confirm that the same issue happens in our cluster every 1-2 weeks. On one occasion it either brought the supervisor cluster down completely or left it in a constantly degrading state. The only remedy for now is to refresh the whole supervisor set.

jira-importer avatar Mar 17 '21 19:03 jira-importer

cozos:

s4nch3z Any guesses on what triggered it on your side?

jira-importer avatar Apr 10 '21 08:04 jira-importer

JIRAUSER288735:

We are also running into this same error, with the same stack trace and same line number. This is causing workers to get restarted. Has anyone been able to figure out what's going on yet?

I see that STORM-3141 is fixed. However, that issue has a different stack trace, and its error occurred on a different line number. So I believe the original issue was fixed, but the fix may now lead to this new error. Or could this be a different issue?

jira-importer avatar Apr 27 '22 14:04 jira-importer

JIRAUSER293173:

This is happening in our cluster (2.4.0) due to a Qualys scan. Does anyone have any thoughts on workarounds (aside from blocking Qualys)?

It seems related to https://issues.apache.org/jira/browse/STORM-1642?page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel&focusedCommentId=15467776#comment-15467776

2022-07-20 17:31:18.311 o.a.s.m.n.StormServerHandler Netty-server-localhost-6700-worker-1 [ERROR] server errors in handling the request from /[Qualys IP]:[Various Ports]
java.lang.NullPointerException: null
	at com.esotericsoftware.kryo.io.Input.setBuffer(Input.java:81) ~[kryo-3.0.3.jar:?]
	at org.apache.storm.serialization.KryoTupleDeserializer.deserialize(KryoTupleDeserializer.java:39) ~[storm-client-2.4.0.jar:2.4.0]
	at org.apache.storm.messaging.DeserializingConnectionCallback.recv(DeserializingConnectionCallback.java:66) ~[storm-client-2.4.0.jar:2.4.0]
	at org.apache.storm.messaging.netty.Server.enqueue(Server.java:149) ~[storm-client-2.4.0.jar:2.4.0]
	at org.apache.storm.messaging.netty.Server.received(Server.java:277) ~[storm-client-2.4.0.jar:2.4.0]
	at org.apache.storm.messaging.netty.StormServerHandler.channelRead(StormServerHandler.java:48) ~[storm-client-2.4.0.jar:2.4.0]
	at org.apache.storm.shade.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362) [storm-shaded-deps-2.4.0.jar:2.4.0]
	at org.apache.storm.shade.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348) [storm-shaded-deps-2.4.0.jar:2.4.0]
	at org.apache.storm.shade.io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:340) [storm-shaded-deps-2.4.0.jar:2.4.0]
	at org.apache.storm.shade.io.netty.handler.codec.ByteToMessageDecoder.fireChannelRead(ByteToMessageDecoder.java:323) [storm-shaded-deps-2.4.0.jar:2.4.0]
	at org.apache.storm.shade.io.netty.handler.codec.ByteToMessageDecoder.fireChannelRead(ByteToMessageDecoder.java:310) [storm-shaded-deps-2.4.0.jar:2.4.0]
	at org.apache.storm.shade.io.netty.handler.codec.ByteToMessageDecoder.callDecode(ByteToMessageDecoder.java:426) [storm-shaded-deps-2.4.0.jar:2.4.0]
	at org.apache.storm.shade.io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:278) [storm-shaded-deps-2.4.0.jar:2.4.0]
	at org.apache.storm.shade.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362) [storm-shaded-deps-2.4.0.jar:2.4.0]
	at org.apache.storm.shade.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348) [storm-shaded-deps-2.4.0.jar:2.4.0]
	at org.apache.storm.shade.io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:340) [storm-shaded-deps-2.4.0.jar:2.4.0]
	at org.apache.storm.shade.io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1434) [storm-shaded-deps-2.4.0.jar:2.4.0]
	at org.apache.storm.shade.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362) [storm-shaded-deps-2.4.0.jar:2.4.0]
	at org.apache.storm.shade.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348) [storm-shaded-deps-2.4.0.jar:2.4.0]
	at org.apache.storm.shade.io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:965) [storm-shaded-deps-2.4.0.jar:2.4.0]
	at org.apache.storm.shade.io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:163) [storm-shaded-deps-2.4.0.jar:2.4.0]
	at org.apache.storm.shade.io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:644) [storm-shaded-deps-2.4.0.jar:2.4.0]
	at org.apache.storm.shade.io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:579) [storm-shaded-deps-2.4.0.jar:2.4.0]
	at org.apache.storm.shade.io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:496) [storm-shaded-deps-2.4.0.jar:2.4.0]
	at org.apache.storm.shade.io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:458) [storm-shaded-deps-2.4.0.jar:2.4.0]
	at org.apache.storm.shade.io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:897) [storm-shaded-deps-2.4.0.jar:2.4.0]
	at java.lang.Thread.run(Thread.java:750) [?:1.8.0_332] 

jira-importer avatar Jul 21 '22 09:07 jira-importer